Ingest data from Amazon S3 to HDFS
The S3 to HDFS Sync Application Template continuously ingests files as blocks from the configured Amazon S3 location to the destination path in HDFS, retaining one-to-one file traceability.
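As a rough illustration of what ingesting a file "as blocks" means, the sketch below splits a file into fixed-size byte ranges and copies each range to the same offset in the destination, so each source file maps to exactly one destination file. It is not the template's implementation: the function names `block_ranges` and `copy_in_blocks` are hypothetical, and in-memory streams stand in for S3 and HDFS.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def block_ranges(file_size: int, block_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that together cover file_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        yield offset, length
        offset += length


def copy_in_blocks(source: BinaryIO, dest: BinaryIO,
                   file_size: int, block_size: int) -> int:
    """Copy source to dest one block at a time; return the block count."""
    blocks = 0
    for offset, length in block_ranges(file_size, block_size):
        source.seek(offset)
        data = source.read(length)
        # Writing at the same offset keeps the destination byte-identical.
        dest.seek(offset)
        dest.write(data)
        blocks += 1
    return blocks


# In-memory stand-ins for a source object and a destination file.
src = io.BytesIO(b"abcdefghij")
dst = io.BytesIO()
copied = copy_in_blocks(src, dst, file_size=10, block_size=4)
```

Because each block is identified only by its offset and length, independent readers could in principle fetch different blocks of the same file in parallel.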
The enterprise-grade application template offers the following features:
- The application scales linearly with the number of block readers
- The application is fault-tolerant and can withstand node and cluster outages without data loss
- The application is highly performant and can run as fast as network bandwidth allows
- Configuration is simple: users need only provide the source S3 connection parameters, bucket, and directory, and the destination HDFS path and filename
- The template dramatically reduces time-to-market and cost of operations
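The settings a user supplies can be pictured as one small record that is checked for completeness before launch. This is only a sketch: the class `S3ToHdfsSettings` and every field name in it are hypothetical, not the template's actual property names, and the access key and secret key pair is an assumed stand-in for the "S3 connection parameters".

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class S3ToHdfsSettings:
    # All field names are hypothetical, chosen for illustration only.
    s3_access_key: str          # assumed part of the S3 connection parameters
    s3_secret_key: str          # assumed part of the S3 connection parameters
    s3_bucket: str              # source bucket
    s3_directory: str           # source directory inside the bucket
    hdfs_destination_path: str  # destination path in HDFS

    def missing(self) -> List[str]:
        """Return the names of settings that were left blank."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


settings = S3ToHdfsSettings(
    s3_access_key="EXAMPLE_KEY",
    s3_secret_key="EXAMPLE_SECRET",
    s3_bucket="example-bucket",
    s3_directory="/incoming",
    hdfs_destination_path="",  # left blank on purpose
)
blank = settings.missing()
```

Checking for blank values up front surfaces a missing destination path before any data is read from S3.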
Download the application template and launch it to ingest your files from Amazon S3 to HDFS. The tutorial videos and walkthrough document below show how to launch and run the template.
Import, configure, and launch application template