- Basic Python and data analysis skills
Audience
- Python developers
- Data analysts
Python is a versatile programming language known for its simplicity and readability. Pandas is a Python package that provides data structures for working with structured (tabular, multidimensional, potentially heterogeneous) and time series data. NumPy provides the foundation for numerical computing in Python through its array type and array operations. Together, they form a robust ecosystem for efficient data handling and analysis in Python.
This instructor-led, live training (online or onsite) is aimed at intermediate-level Python developers and data analysts who wish to enhance their skills in data analysis and manipulation using Pandas and NumPy.
By the end of this training, participants will be able to:
- Set up a development environment that includes Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyze time series data.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Day 1:
Basic Python and Data Analysis Skills Review
Introduction to NumPy
- Creating NumPy arrays
- Common operations on matrices
- Using ufuncs
- Views and broadcasting on NumPy arrays
- Optimizing performance by avoiding loops
- Optimizing performance with cProfile
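As a rough preview of the NumPy topics listed above, the following minimal sketch touches on array creation, broadcasting, ufuncs, views, and a loop-versus-vectorized comparison profiled with cProfile. The array values and the helper names `loop_sum_of_squares` and `vectorized_sum_of_squares` are invented for illustration and are not part of the course material.

```python
import cProfile
import io
import pstats

import numpy as np

# Creating arrays: a 2x3 integer matrix and a 1-D row vector.
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])

# Broadcasting: the (3,) row is stretched across both rows of the (2, 3) matrix.
shifted = a + row

# ufunc: np.sqrt is applied element-wise without an explicit Python loop.
roots = np.sqrt(a)

# Views: basic slicing shares memory with the original array,
# so writing through the view also changes `a`.
first_col = a[:, 0]
first_col[0] = 99


def loop_sum_of_squares(values):
    """Sum of squares with an explicit Python loop (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def vectorized_sum_of_squares(values):
    """Same computation expressed as a single dot product (hypothetical helper)."""
    return float(np.dot(values, values))


data = np.linspace(0.0, 1.0, 100_000)

# Profile both versions; the report lists where the time was spent.
profiler = cProfile.Profile()
profiler.enable()
loop_result = loop_sum_of_squares(data)
vector_result = vectorized_sum_of_squares(data)
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The two functions should return the same value up to floating-point rounding. On most machines the profile is expected to show the loop version dominating the run time, although exact timings depend on the environment.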
Data Analysis with Pandas
- Using vectorized data in pandas
- Data wrangling
- Sorting and filtering data
- Aggregate operations
- Analyzing time series
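The pandas topics above can be sketched in a few lines. This is only an illustrative example under stated assumptions: the sales table, its column names, and its values are made up and do not come from the course material.

```python
import pandas as pd

# Hypothetical daily sales data, invented for this example.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame({
    "date": dates,
    "region": ["north", "south", "north", "south", "north", "south"],
    "units": [5, 3, 8, 1, 7, 4],
    "price": [2.0, 2.5, 2.0, 2.5, 2.0, 2.5],
})

# Vectorized column arithmetic: no explicit loop over rows.
df["revenue"] = df["units"] * df["price"]

# Filtering with a boolean mask, then sorting by revenue (largest first).
big_orders = df[df["units"] >= 4].sort_values("revenue", ascending=False)

# Aggregate operation: total revenue per region.
revenue_by_region = df.groupby("region")["revenue"].sum()

# Time series: index by date, then resample into 2-day bins
# and compute a 3-day rolling mean.
revenue_ts = df.set_index("date")["revenue"]
two_day_totals = revenue_ts.resample("2D").sum()
rolling_mean = revenue_ts.rolling(window=3).mean()

print(big_orders)
print(revenue_by_region)
print(two_day_totals)
```

The first two entries of `rolling_mean` are NaN because a 3-day window is not yet full at those points.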
MySQL Database visualized with Tableau
Day 2:
Other Python Libraries for Data Analysis
- scikit-learn
- SciPy
- statsmodels
- rpy2
Summary and Next Steps