Introduction
AWS IoT SiteWise is a managed service that helps customers collect, store, organize, and monitor data from their industrial equipment at scale. Customers often need to bring historical equipment measurement data from existing systems such as data historians and time series databases into AWS IoT SiteWise to ensure data continuity, train artificial intelligence (AI) and machine learning (ML) models that can predict equipment failures, and derive actionable insights.
In this blog post, we will show how you can get started with the BulkImportJob API and import historical equipment data into AWS IoT SiteWise using a code sample.
You can use this imported data to gain insights through AWS IoT SiteWise Monitor and Amazon Managed Grafana, train ML models on Amazon Lookout for Equipment and Amazon SageMaker, and power analytical applications.
To begin a bulk import, customers need to upload a CSV file to Amazon Simple Storage Service (Amazon S3) containing their historical data in a predefined format. After uploading the CSV file, customers can initiate the asynchronous import to AWS IoT SiteWise using the CreateBulkImportJob operation, and monitor the progress using the DescribeBulkImportJob and ListBulkImportJobs operations.
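Before walking through the console and sample scripts, it may help to see these operations in Boto3 form. The sketch below assumes placeholder bucket names, file key, job name, and role ARN; the CSV column names follow the schema used later in this post.

```python
def build_bulk_import_request(job_name, role_arn, data_bucket, data_key,
                              error_bucket):
    """Assemble CreateBulkImportJob kwargs for one CSV file; the column
    order must match the uploaded file exactly."""
    return {
        "jobName": job_name,
        "jobRoleArn": role_arn,
        "files": [{"bucket": data_bucket, "key": data_key}],
        "errorReportLocation": {"bucket": error_bucket, "prefix": "errors/"},
        "jobConfiguration": {
            "fileFormat": {
                "csv": {
                    "columnNames": [
                        "ASSET_ID", "PROPERTY_ID", "DATA_TYPE",
                        "TIMESTAMP_SECONDS", "TIMESTAMP_NANO_OFFSET",
                        "QUALITY", "VALUE",
                    ]
                }
            }
        },
    }

def submit_and_check(request):
    """Submit the job and return its initial status (requires AWS credentials)."""
    import boto3  # deferred so build_bulk_import_request stays testable offline
    sitewise = boto3.client("iotsitewise")
    job = sitewise.create_bulk_import_job(**request)
    return sitewise.describe_bulk_import_job(jobId=job["jobId"])["jobStatus"]
```

The sample repository used in this post wraps the same calls; this standalone sketch just makes the request shape visible.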
Prerequisites
To follow along with this blog post, you will need an AWS account and an AWS IoT SiteWise supported region. If you are already using AWS IoT SiteWise, choose a different region for an isolated environment. You are also expected to have some familiarity with Python.
Set up the environment
- Create an AWS Cloud9 environment using the Amazon Linux 2 platform.
- Using the terminal in your Cloud9 environment, install Git and clone the sitewise-bulk-import-example repository from GitHub:

sudo yum install git
git clone https://github.com/aws-samples/aws-iot-sitewise-bulk-import-example.git
cd aws-iot-sitewise-bulk-import-example
pip3 install -r requirements.txt
Walkthrough
For the demonstration in this post, we will use an AWS Cloud9 instance to represent an on-premises developer workstation and simulate two months of historical data for several production lines in an automobile manufacturing facility.
We will then prepare the data and import it into AWS IoT SiteWise at scale, using multiple bulk import jobs. Finally, we will verify whether the data was imported successfully.
A bulk import job can import data into the two storage tiers offered by AWS IoT SiteWise, depending on how the storage is configured. Before we proceed, let us first define these two storage tiers.
Hot tier: Stores frequently accessed data with lower write-to-read latency. This makes the hot tier ideal for operational dashboards, alarm management systems, and any other applications that require fast access to recent measurement values from equipment.
Cold tier: Stores less frequently accessed data with higher read latency, making it ideal for applications that require access to historical data. For instance, it can be used in business intelligence (BI) dashboards and artificial intelligence (AI) and machine learning (ML) training. To store data in the cold tier, AWS IoT SiteWise uses an S3 bucket in the customer's account.
Retention period: Determines how long your data is stored in the hot tier before it is deleted.
Now that we have learned about the storage tiers, let us understand how a bulk import job handles writes in different scenarios. Refer to the table below:
Value | Timestamp | Write behavior
--- | --- | ---
New | New | A new data point is created
New | Existing | The existing data point is updated with the new value for the provided timestamp
Existing | Existing | The import job identifies the duplicate data and discards it. No changes are made to existing data.
In the next section, we will follow step-by-step instructions to import historical equipment data into AWS IoT SiteWise.
Steps to import historical data
Step 1: Create a sample asset hierarchy
For the purpose of this demonstration, we will create a sample asset hierarchy for a fictitious automobile manufacturer with operations across four different cities. In a real-world scenario, you may already have an existing asset hierarchy in AWS IoT SiteWise, in which case this step is optional.
Step 1.1: Review the configuration
- From the terminal, navigate to the root of the Git repo.
- Review the configuration for asset models and assets.
cat config/assets_models.yml
- Review the schema for asset properties.
cat schema/sample_stamping_press_properties.json
Step 1.2: Create asset models and assets
- Run python3 src/create_asset_hierarchy.py to automatically create asset models, hierarchy definitions, assets, and asset associations.
- In the AWS Console, navigate to AWS IoT SiteWise, and verify the newly created Models and Assets.
- Verify that you see an asset hierarchy similar to the one below.
Step 2: Prepare historical data
Step 2.1: Simulate historical data
In this step, for demonstration purposes, we will simulate two months of historical data for four stamping presses across two production lines. In a real-world scenario, this data would typically come from source systems such as data historians and time series databases.
The CreateBulkImportJob API has the following key requirements:
- To identify an asset property, you need to specify either an ASSET_ID + PROPERTY_ID combination or the ALIAS. In this blog, we will be using the former.
- The data needs to be in CSV format.
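To make the expected CSV layout concrete, here is a small sketch that emits a few rows in the column order used in this post. The asset and property IDs are made-up placeholders; real files use the IDs created in Step 1.

```python
import csv
import io

# Hypothetical IDs for illustration only.
ASSET_ID = "11111111-2222-3333-4444-555555555555"
PROPERTY_ID = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"

def make_rows(values, start_epoch, period_s=60):
    """One row per measurement: ASSET_ID, PROPERTY_ID, DATA_TYPE,
    TIMESTAMP_SECONDS, TIMESTAMP_NANO_OFFSET, QUALITY, VALUE."""
    return [
        [ASSET_ID, PROPERTY_ID, "DOUBLE", start_epoch + i * period_s, 0, "GOOD", v]
        for i, v in enumerate(values)
    ]

buf = io.StringIO()
csv.writer(buf).writerows(make_rows([1013.4, 1024.9, 998.7], 1667250000))
print(buf.getvalue())
```

Each row carries its own asset/property identification, so one file can mix measurements from many properties.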
Follow the steps below to generate data according to these expectations. For more details about the schema, refer to Ingesting data using the CreateBulkImportJob API.
- Review the configuration for data simulation.
cat config/data_simulation.yml
- Run python3 src/simulate_historical_data.py to generate simulated historical data for the chosen properties and time period. If the total rows exceed rows_per_job as configured in bulk_import.yml, multiple data files will be created to support parallel processing. In this sample, about 700,000+ data points are simulated for the four stamping presses (A-D) across two production lines (Sample_Line 1 and Sample_Line 2). Since we configured rows_per_job as 20,000, a total of 36 data files will be created.
- Verify the generated data files under the data directory.
- The data schema will follow the column_names configured in the bulk_import.yml config file.
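The chunking behavior described above can be sketched as follows. The file-naming scheme and output directory here are illustrative, not necessarily what the repo's script uses.

```python
import csv
from pathlib import Path

def split_into_files(rows, rows_per_job, out_dir):
    """Write rows into numbered CSV chunks of at most rows_per_job rows,
    so each chunk can later be handed to its own bulk import job."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(0, len(rows), rows_per_job):
        path = out / f"chunk_{i // rows_per_job:03d}.csv"
        with path.open("w", newline="") as f:
            csv.writer(f).writerows(rows[i:i + rows_per_job])
        paths.append(path)
    return paths

# 100,000 rows at 20,000 rows per file -> 5 chunk files
print(len(split_into_files([["r"]] * 100_000, 20_000, "/tmp/demo_chunks")))
```

Keeping each file at or below rows_per_job lets the import step fan the files out across concurrent jobs.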
Step 2.2: Upload historical data to Amazon S3
Since AWS IoT SiteWise requires the historical data to be available in Amazon S3, we will upload the simulated data to the chosen S3 bucket.
- Update the data bucket under bulk_import.yml with any existing temporary S3 bucket that can be deleted later.
- Run python3 src/upload_to_s3.py to upload the simulated historical data to the configured S3 bucket.
- Navigate to Amazon S3 and verify the objects were uploaded successfully.
Step 3: Import historical data into AWS IoT SiteWise
Before you can import historical data, AWS IoT SiteWise requires that you enable cold tier storage. For more details, refer to Configuring storage settings.
If you have already activated cold tier storage, consider changing the S3 bucket to a temporary one that can be deleted later while cleaning up the sample resources.
Note that when you change the S3 bucket, none of the data from the existing cold tier S3 bucket is copied to the new bucket. When modifying the S3 bucket location, make sure the IAM role configured under S3 access role has permissions to access the new S3 bucket.
Step 3.1: Configure storage settings
- Navigate to AWS IoT SiteWise, select Storage, then select Activate cold tier storage.
- Pick an S3 bucket location of your choice.
- Select Create a role from an AWS managed template.
- Check Activate retention period, enter 30 days, and save.
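The same storage settings can be applied programmatically with the PutStorageConfiguration operation. This is a sketch under assumptions: the bucket name and role ARN are placeholders, and options not used in this post (such as disassociated data storage) are left at their defaults.

```python
def storage_configuration(bucket, role_arn, retention_days=30):
    """Kwargs for PutStorageConfiguration: activate the cold tier
    (customer-managed S3) plus a hot tier retention period."""
    return {
        "storageType": "MULTI_LAYER_STORAGE",
        "multiLayerStorage": {
            "customerManagedS3Storage": {
                "s3ResourceArn": f"arn:aws:s3:::{bucket}",
                "roleArn": role_arn,
            }
        },
        "retentionPeriod": {"numberOfDays": retention_days, "unlimited": False},
    }

def apply_storage_configuration(cfg):
    """Apply the settings (requires AWS credentials and permissions)."""
    import boto3  # deferred so storage_configuration stays testable offline
    return boto3.client("iotsitewise").put_storage_configuration(**cfg)
```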
Step 3.2: Provide permissions for AWS IoT SiteWise to read data from Amazon S3
- Navigate to AWS IAM, select Policies under Access management, and Create policy.
- Switch to the JSON tab and replace the content with the following. Update <bucket-name> with the name of the data S3 bucket configured in bulk_import.yml.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": [
                "arn:aws:s3:::<bucket-name>",
                "arn:aws:s3:::<bucket-name>/*"
            ]
        }
    ]
}
- Save the policy with the Name SiteWiseBulkImportPolicy.
- Select Roles under Access management, and Create role.
- Select Custom trust policy and replace the content with the following.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "iotsitewise.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
- Click Next and select the SiteWiseBulkImportPolicy IAM policy created in the previous steps.
- Click Next and create the role with the Role name SiteWiseBulkImportRole.
- Select Roles under Access management, search for the newly created IAM role SiteWiseBulkImportRole, and click on its name.
- Copy the ARN of the IAM role using the copy icon.
Step 3.3: Create AWS IoT SiteWise bulk import jobs
- Update the config/bulk_import.yml file:
  - Replace the role_arn with the ARN of the SiteWiseBulkImportRole IAM role copied in the previous steps.
  - Replace the error_bucket with any existing temporary S3 bucket that can be deleted later.
- Run python3 src/create_bulk_import_job.py to import historical data from the S3 bucket into AWS IoT SiteWise.
- The script will create multiple jobs to concurrently import all the data files into AWS IoT SiteWise. In a real-world scenario, several terabytes of data can be quickly imported into AWS IoT SiteWise using concurrently running jobs.
- Check the status of the jobs from the output.
- If you see the status of any job as COMPLETED_WITH_FAILURES or FAILED, refer to the Troubleshoot common issues section.
Step 4: Verify the imported data
Once the bulk import jobs are completed, we need to verify that the historical data was successfully imported into AWS IoT SiteWise. You can verify the data either by looking directly at the cold tier storage or by visually inspecting the charts available in AWS IoT SiteWise Monitor.
Step 4.1: Using the cold tier storage
In this step, we will check whether new S3 objects have been created in the bucket that was configured for the cold tier.
- Navigate to Amazon S3 and locate the S3 bucket configured under AWS IoT SiteWise → Storage → S3 bucket location (in Step 3) for cold tier storage.
- Verify the partitions and objects under the raw/ prefix.
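The same check can be scripted: list the cold tier bucket and count what sits under the raw/ prefix. The bucket name is a placeholder; a non-zero count indicates SiteWise has written imported data to S3.

```python
def keys_under(prefix, keys):
    """Keep only the object keys below the cold tier prefix."""
    return [k for k in keys if k.startswith(prefix)]

def count_cold_tier_objects(bucket, prefix="raw/"):
    """List the bucket and count objects under prefix (requires AWS
    credentials)."""
    import boto3  # deferred so keys_under stays testable offline
    s3 = boto3.client("s3")
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        keys += [obj["Key"] for obj in page.get("Contents", [])]
    return len(keys_under(prefix, keys))
```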
Step 4.2: Using AWS IoT SiteWise Monitor
In this step, we will visually inspect whether the charts show data for the imported date range.
- Navigate to AWS IoT SiteWise and locate Monitor.
- Create a portal to access data stored in AWS IoT SiteWise.
  - Provide AnyCompany Motor as the Portal name.
  - Choose IAM for User authentication.
  - Provide your email address for Support contact email, and click Next.
  - Leave the default configuration for Additional features, and click Create.
  - Under Invite administrators, select your IAM user or IAM role, and click Next.
  - Click on Assign Users.
- Navigate to Portals and open the newly created portal.
- Navigate to Assets and select an asset, for example, AnyCompany_Motor → Sample_Arlington → Sample_Stamping → Sample_Line 1 → Sample_Stamping Press A.
- Use Custom range to match the date range of the uploaded data.
- Verify the data rendered in the time series line chart.
Troubleshoot common issues
In this section, we will cover common issues encountered while importing data using bulk import jobs and highlight some possible causes.
If a bulk import job does not complete successfully, it is best practice to refer to the logs in the error S3 bucket configured in bulk_import.yml to understand the root cause.
No data imported
- Incorrect schema: dataType does not match dataType tied to the asset-property
The schema provided at Ingesting data using the CreateBulkImportJob API should be followed exactly. Using the console, verify that the provided DATA_TYPE matches the data type of the corresponding asset model property.
- Incorrect ASSET_ID or PROPERTY_ID: Entry is not modeled
Using the console, verify that the corresponding asset and property exist.
- Duplicate data: A value for this timestamp already exists
AWS IoT SiteWise detects and automatically discards any duplicates. Using the console, verify whether the data already exists.
Missing only certain parts of data
- Missing recent data: The BulkImportJob API imports recent data (that falls within the hot tier retention period) into the AWS IoT SiteWise hot tier and does not transfer it immediately to Amazon S3 (cold tier). You may need to wait for the next hot to cold tier transfer cycle, which is currently set to 6 hours.
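To confirm that recent data landed in the hot tier without waiting for the S3 transfer cycle, you can query it directly. A sketch using the GetAssetPropertyValueHistory operation; the asset and property IDs are the ones created in Step 1, and the 30-day window matches the retention period configured earlier.

```python
from datetime import datetime, timedelta

def recent_window(days=30):
    """Hot tier query window: now back to the retention boundary."""
    end = datetime.now()
    return end - timedelta(days=days), end

def fetch_recent_history(asset_id, property_id, days=30):
    """Read recent values straight from SiteWise (hot tier) rather than S3.
    Requires AWS credentials."""
    import boto3  # deferred so recent_window stays testable offline
    sitewise = boto3.client("iotsitewise")
    start, end = recent_window(days)
    resp = sitewise.get_asset_property_value_history(
        assetId=asset_id, propertyId=property_id,
        startDate=start, endDate=end, maxResults=250)
    return resp["assetPropertyValueHistory"]
```

A non-empty history for the imported window confirms the hot tier portion of the import succeeded.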
Clean up
To avoid any recurring charges, remove the resources created in this blog. Follow these steps to delete them:
- Navigate to AWS Cloud9 and delete your environment.
- Run python3 src/clean_up_asset_hierarchy.py to delete the following resources, in order, from AWS IoT SiteWise:
  - Asset associations
  - Assets
  - Hierarchy definitions from asset models
  - Asset models
- From the AWS IoT SiteWise console, navigate to Monitor → Portals, select the previously created portal, and delete it.
- Navigate to Amazon S3 and perform the following:
  - Delete the S3 bucket location configured under the Storage section of AWS IoT SiteWise
  - Delete the data and error buckets configured in the /config/bulk_import.yml of the Git repo
Conclusion
In this post, you learned how to use the AWS IoT SiteWise BulkImportJob API to import historical equipment data into AWS IoT SiteWise using the AWS Python SDK (Boto3). You can also use the AWS CLI or SDKs for other programming languages to perform the same operation. To learn more about all supported ingestion mechanisms for AWS IoT SiteWise, visit the documentation.
About the authors
Raju Gottumukkala is an IoT Specialist Solutions Architect at AWS, helping industrial manufacturers in their smart manufacturing journey. Raju has helped major enterprises across the energy, life sciences, and automotive industries improve operational efficiency and revenue growth by unlocking the true potential of IoT data. Prior to AWS, he worked for Siemens and co-founded dDriven, an Industry 4.0 Data Platform company.
Avik Ghosh is a Senior Product Manager on the AWS Industrial IoT team, focusing on the AWS IoT SiteWise service. With over 18 years of experience in technology innovation and product delivery, he specializes in Industrial IoT, MES, Historian, and large-scale Industry 4.0 solutions. Avik contributes to the conceptualization, research, definition, and validation of Amazon IoT service offerings.