Requirements
- Familiarity with Apache Spark
- Python programming experience
Audience
- Data scientists
- Developers
Spark NLP is an open-source library, built on Apache Spark, for natural language processing with Python, Java, and Scala. It is widely used in enterprise settings across industry verticals such as healthcare, finance, life sciences, and recruiting.
This instructor-led, live training (online or onsite) is aimed at data scientists and developers who wish to use Spark NLP to develop, implement, and scale natural language text processing models and pipelines.
By the end of this training, participants will be able to:
- Set up the necessary development environment to start building NLP pipelines with Spark NLP.
- Understand the features, architecture, and benefits of using Spark NLP.
- Use the pre-trained models available in Spark NLP to implement text processing.
- Build, train, and scale Spark NLP models for production-grade projects.
- Apply classification, inference, and sentiment analysis to real-world use cases (clinical data, customer behavior insights, etc.).
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.
Introduction
- Spark NLP vs NLTK vs spaCy
- Overview of Spark NLP features and architecture
Getting Started
- Setup requirements
- Installing Spark NLP
- General concepts
Using Pre-trained Pipelines
- Importing required modules
- Default annotators
- Loading a pipeline model
- Transforming texts
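The steps in this module (importing modules, loading a pipeline model, transforming text) can be sketched roughly as follows. This is a minimal illustration, not course material: the function name `annotate_with_pretrained` is hypothetical, the pipeline name `explain_document_ml` is assumed to be available for download, and running it requires the `spark-nlp` and `pyspark` packages plus network access.

```python
def annotate_with_pretrained(text, name="explain_document_ml", lang="en"):
    """Load a pre-trained Spark NLP pipeline and annotate one string.

    Imports are done lazily so this module can be loaded even when
    Spark NLP is not installed; calling the function does need it.
    """
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    # Start (or reuse) a Spark session configured for Spark NLP.
    sparknlp.start()

    # Download the pipeline on first use, then load it.
    pipeline = PretrainedPipeline(name, lang=lang)

    # annotate() returns a dict mapping output column names to lists of strings.
    return pipeline.annotate(text)


# Example call (assumption: the libraries and the model are available):
# result = annotate_with_pretrained("Spark NLP runs on top of Apache Spark.")
# print(result.keys())
```

The keys of the returned dictionary depend on which annotators the chosen pipeline contains, so it is worth printing them before relying on any one of them.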
Building NLP Pipelines
- Understanding the pipeline API
- Implementing NER models
- Choosing embeddings
- Using word, sentence, and universal embeddings
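As a rough sketch of the pipeline API covered in this module: each Spark NLP annotator reads one or more input columns and writes one output column, so stages must be ordered so that every input already exists. The helper `check_wiring` and the `STAGES` table below are hypothetical, written only to illustrate that idea in plain Python; `build_pipeline` assumes `spark-nlp` and `pyspark` are installed and uses only the document assembler, sentence detector, and tokenizer.

```python
# Plain-data description of a minimal pipeline:
# (stage name, input columns, output column).
STAGES = [
    ("DocumentAssembler", ["text"], "document"),
    ("SentenceDetector", ["document"], "sentence"),
    ("Tokenizer", ["sentence"], "token"),
]


def check_wiring(stages, source_columns=("text",)):
    """Return (stage name, missing column) pairs for inputs that no
    earlier stage (or the source DataFrame) provides."""
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        for col in inputs:
            if col not in available:
                problems.append((name, col))
        available.add(output)
    return problems


def build_pipeline():
    """Build the same three stages as a Spark ML Pipeline.

    Imports are lazy: calling this requires pyspark and spark-nlp.
    """
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import SentenceDetector, Tokenizer

    document = DocumentAssembler().setInputCol("text").setOutputCol("document")
    sentence = SentenceDetector().setInputCols(["document"]).setOutputCol("sentence")
    token = Tokenizer().setInputCols(["sentence"]).setOutputCol("token")
    return Pipeline(stages=[document, sentence, token])
```

An NER or embeddings stage would be added the same way: append the annotator after the stages that produce the columns it reads.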
Classification and Inference
- Document classification use cases
- Sentiment analysis models
- Training a document classifier
- Using other machine learning frameworks
- Managing NLP models
- Optimizing models for low-latency inference
Troubleshooting
Summary and Next Steps