AWS Data Wrangler

Move pandas/spark dataframes across AWS services

What is AWS Data Wrangler?

It is a utility belt to handle data on AWS. It aims to fill a gap between AWS Analytics Services (Glue, Athena, EMR, Redshift) and the most popular Python data libraries (Pandas, Apache Spark).
AWS Data Wrangler is a tool in the Data Science Tools category of a tech stack.
AWS Data Wrangler is an open source tool, and its source repository is hosted on GitHub.

Who uses AWS Data Wrangler?

Developers
4 developers on StackShare have stated that they use AWS Data Wrangler.

AWS Data Wrangler Integrations

Apache Spark, Amazon Athena, PySpark, Apache Parquet, and Neos CMS are the 5 tools listed as integrating with AWS Data Wrangler.

AWS Data Wrangler's Features

  • Writes in Parquet and CSV file formats
  • Utility belt to handle data on AWS

AWS Data Wrangler Alternatives & Comparisons

What are some alternatives to AWS Data Wrangler?
NumPy
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
Pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.
SciPy
Python-based ecosystem of open-source software for mathematics, science, and engineering. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.
Anaconda
A free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. Package versions are managed by the package management system conda.
Dataform
Dataform helps you manage all data processes in your cloud data warehouse. Publish tables, write data tests and automate complex SQL workflows in a few minutes, so you can spend more time on analytics and less time managing infrastructure.

AWS Data Wrangler's Followers
28 developers follow AWS Data Wrangler to keep up with related blogs and decisions.