David Vedvick


May 20th, 2021

These are notes I took from the 2021 Open-Source North conference, which takes place annually in Minneapolis, Minnesota.

Deep Learning for Natural Language Processing

  • Natural Language Understanding (NLU) is the focus of this talk

Natural Language Generation

  • Mapping from computer representation space to language space
  • Opposite direction of NLU

Deep Learning

  • Subfield of machine learning
  • Algorithms inspired by the structure and function of the brain, called artificial neural networks
  • Its advantage over traditional machine learning is that it extracts features automatically

Text is Messy

  • Punctuation, typos, unknown words, etc.

Preprocessing Techniques

  1. Turn the text into a meaningful format for analysis (tokenization)
  2. Clean the data
    • Lowercase the text; remove punctuation, numbers, and stop words
    • Stemming
    • Part-of-speech tagging
    • Correct misspelled words
    • Chunking (named entity recognition, compound term extraction)
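As a rough sketch of the cleaning steps above (the stop-word list is a hypothetical example; real pipelines use much larger lists and proper tokenizers):

```rust
/// Tokenize and clean raw text: lowercase, strip punctuation from
/// token edges, split on whitespace, and drop stop words.
fn preprocess(text: &str, stop_words: &[&str]) -> Vec<String> {
    text.to_lowercase()
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_string())
        .filter(|w| !w.is_empty() && !stop_words.contains(&w.as_str()))
        .collect()
}
```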

Preprocessing: stemming

Stemming and Lemmatization = cut a word down to its base form

  • Stemming: uses rough heuristics to reduce words to a base form
  • Lemmatization: uses vocabulary and morphological analysis
  • Maps run, runs, running, and ran to the same base form
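A toy illustration of the "rough heuristics" idea (these three suffix rules are invented for the example; real stemmers such as Porter use far richer rule sets):

```rust
/// Toy suffix-stripping stemmer. Note that irregular forms like
/// "ran" are untouched -- unifying those requires lemmatization,
/// which uses vocabulary and morphology rather than heuristics.
fn stem(word: &str) -> String {
    for suffix in ["ning", "ing", "s"] {
        if let Some(base) = word.strip_suffix(suffix) {
            if base.len() >= 3 {
                return base.to_string();
            }
        }
    }
    word.to_string()
}
```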

Bag of words

A way of representing text data when modeling text with machine learning or deep learning algorithms: each document becomes a vector of word counts, ignoring word order
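A minimal sketch of the idea, keyed by word rather than by vocabulary index:

```rust
use std::collections::HashMap;

/// Count word occurrences, discarding order -- the "bag" in
/// bag of words.
fn bag_of_words(tokens: &[&str]) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for t in tokens {
        *counts.entry(t.to_string()).or_insert(0) += 1;
    }
    counts
}
```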

Word embeddings

Type of word representation that allows words with similar meaning to have a similar representation

  • They are a distributed representation of text
  • Word embedding methods learn a real-valued vector representation for a predefined, fixed-size vocabulary from a corpus of text


  • Problem: count vectors far too large for many documents
    • Solution: Word2Vec reduces number of dimensions (configurable, e.g. 300)
  • Problem: bag of words neglects word order
  • Skip-gram: a neural network architecture that uses a word to predict the words in its surrounding context, defined by the window size
  • Continuous Bag of Words: CBOW uses the surrounding context to predict the center word
  • What happens? The network learns which words are likely to appear near each word
  • Word vectors can be combined to create features for documents
  • Use Document Vectors for ML/DL on documents (classification, etc.)
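To make the skip-gram setup concrete, here is a sketch of how (center, context) training pairs are generated from a window size; this only builds the training data, not the network that learns from it:

```rust
/// Generate skip-gram (center, context) pairs: for each token,
/// pair it with every token within `window` positions. Word2Vec
/// learns embeddings by predicting the context word from the
/// center word over many such pairs.
fn skipgram_pairs<'a>(tokens: &[&'a str], window: usize) -> Vec<(&'a str, &'a str)> {
    let mut pairs = Vec::new();
    for (i, &center) in tokens.iter().enumerate() {
        let start = i.saturating_sub(window);
        let end = (i + window).min(tokens.len() - 1);
        for j in start..=end {
            if j != i {
                pairs.push((center, tokens[j]));
            }
        }
    }
    pairs
}
```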

Feature Selection

A manual process in traditional machine learning techniques, which happens automatically in deep learning

Embeddings + CNN

Uses word embeddings to represent words and a convolutional neural network for the classification task. The architecture has 3 pieces:

  1. Word embedding model: generate word vectors
  2. Convolutional model: extracts salient features from documents
  3. Fully-connected model: interprets the extracted features in terms of a predictive model

Recurrent Neural Networks

  • Networks with loops
  • Allows information to persist
  • Enables connecting previous information to present task
  • Context preserved

Vanishing Gradients with RNNs

  • In their simplest form, RNNs don't work as well as desired
  • Gradients shrink during backpropagation, so earlier layers learn slowly
  • Long Short-Term Memory (LSTM) units help combat the vanishing gradient problem by introducing an "error carousel"
    • Allows learning sequences, keeping track of the order, without a vanishing gradient

Major challenges with DL for NLP

  • Data size: RNNs/LSTMs don't generalize well on small datasets
  • Relevant corpus: required to create domain-specific word embeddings
  • Deeper networks: empirically, deeper networks have better accuracy
  • Training time: RNNs take a long time to train


Robot Rock - AI and Music Composition

  • Previous Examples of Music Generation:
    • Mozart?! - developed a musical dice game for auto-generating music
    • ILLIAC I
    • Neural Networks

Machine Learning used for Music Generation

  • Standard feed forward networks aren't a good fit for predicting sequential events (e.g. music, text)
    • Limitation: fixed number of inputs/outputs

Recurrent Neural Networks (RNN)

  • Better for text/music
  • LSTM is key to improving results

Music Encoding Options

  • MIDI
  • Waveform

Programs that Do Music Generation

  • Amper
  • AIVA
    • Generative soundtracks for video games?!
  • LANDR - AI based mastering of music
  • Magenta (Google)
  • OpenAI: MuseNet (with the MuseTree front-end), JukeBox
  • PopGun
  • Live performance tools: TidalCycles, Orca

Artists That Use AI

  • Taryn Southern - I Am AI (2018)
  • Yacht - Chain Tripping (2019)
    • Transcribed their entire back catalog to MIDI to train Magenta
    • Treated ML as a collaborator
  • Holly Herndon - Proto (2019)
    • Created "Spawn", which performed music
    • She earned her PhD based on this album

Lyrics Generation

  • A GPT-2 model can be used to generate lyrics

Empowering Streams through KSQL

  • Querying Kafka streams through KSQL
  • Custom Data Integration is hard: ephemeral isn't useful, stateful is hard
  • Kafka: A-B integration allows loose coupling, with Kafka as the middle layer
  • Kafka can handle load with a very predictable, linearly scaling model
  • Kafka organizes data into "Topics", which are divided into partitions

Kafka Data Transformation

  • Single Message Transforms (SMT)
    • Transformations configured via JSON
  • KStreams: advanced message transforms in Java
  • KStream - unending list of messages arriving
  • KTable - a projection of the most recent value in a KStream
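The KStream/KTable relationship can be illustrated with a toy fold (plain Rust for illustration, not the actual Kafka Streams API): a KTable is, conceptually, the most recent value seen per key in the stream.

```rust
use std::collections::HashMap;

/// Fold a stream of (key, value) messages into a table holding the
/// most recent value per key -- conceptually, the KTable projection
/// of a KStream (illustration only, not the Kafka Streams API).
fn to_table(stream: &[(&str, &str)]) -> HashMap<String, String> {
    let mut table = HashMap::new();
    for (key, value) in stream {
        // Later messages for the same key overwrite earlier ones.
        table.insert(key.to_string(), value.to_string());
    }
    table
}
```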


KSQL

KSQL uses a SQL interface to work against KTables/KStreams

  • The EMIT CHANGES keyword runs a query continuously
  • Has basic querying capabilities and other functions that work against Kafka streams
  • ksqldb.io
  • Confluent open source
  • Confluent runs cloud native Kafka distribution
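A hedged sketch of what such a continuous ("push") query looks like; the stream name and columns here are hypothetical:

```sql
-- Hypothetical stream; EMIT CHANGES keeps the query running,
-- emitting updated counts as new messages arrive on the topic.
SELECT user_id, COUNT(*) AS page_views
FROM pageviews_stream
GROUP BY user_id
EMIT CHANGES;
```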

Lessons on Chaos Engineering

Chaos engineering is experimentation: building experiments around a steady-state hypothesis.

  • Not all signals are useful.
  • "The future seems implausible, the past incredible"
  • Weak signals are the signals we get before something goes wrong; they provide important insight before failure occurs
  • Search for how close we are to failure
  • Past signals may not be future signals; future signals may come from areas that were not signalling before

Insights that Come From Weak Signals

  • On-call shifts should end on Fridays!
    • Engineers are tired at the end of a shift
    • On-call shifts ended on Fridays, and began for the next person that same Friday
  • A designated "ops-support" person
  • "I don't know anything about this, we'll need to talk to Emma.": signalling the system is approaching a boundary - what happens if Emma decides to pursue other opportunities?
  • Value proposition of chaos engineering is the insights you gain
  • Rare that a single signal is strong enough
  • Having a multi-functional product team is the best way to make products

Technical Excellence through Mob Programming

  • Retrospectives: tie together learning time
  • 1 year of no bugs! Organization chose to scale mob programming.

How to Mob Program

The Mob Programming RPG


  • Driver: drive the PC
  • Navigator: gives the directions on what to program
  • Mobber: yields to less privileged voices, contributes ideas
  • Researcher: break off on tangents to look into different ideas
  • Sponsor: speak-up for others
  • Navigator of the Navigator: navigates the navigator!
  • Automationist: sees a developer doing the same thing over and over again, might be able to automate those things
  • The Nose: calls out code smells
  • Traffic Cop: keeps everyone in line

Other mob role taxonomies exist


  • Treat everyone with kindness, consideration, and respect
  • No one between code and production
  • Clean Code - code expressed cleanly within the domain
  • Zero Bugs!
  • Deliver Working Software to Production Consistently
  • Anyone can take a vacation (zero silos)
  • Effective interdepartmental ownership
  • Continuously develop lofty goals and practices
  • Experiment Frequently with small changes


  • High Bandwidth Learning
  • Quality and Technical Debt
  • Group Conscientiousness
  • Flow is easier in a mob vs pairing
  • No more bugs!

Law of Personal Mobility

  • If you are not contributing or learning, go to a different mob

There and Back Again: Our Rust Adoption Journey

  • Async implies IO

  • & means the value passed into a method is borrowed immutably (a shared reference)

    • async fn verify_signature(token: &Jwt)
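The shared-borrow behavior can be sketched with a self-contained example (Token here is a stand-in struct, not the talk's Jwt type):

```rust
struct Token {
    claims: String,
}

/// `&Token` is a shared borrow: the function can read the token,
/// but the compiler rejects any attempt to mutate it.
fn claims_len(token: &Token) -> usize {
    // token.claims.push('!'); // would not compile: cannot mutate through `&`
    token.claims.len()
}
```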
  • State Machines - enabled by enum types having fields

    enum User {
      Pending {
        email: Email
      },
      Active {
        email: Email,
        confirmation_timestamp: DateTime<Utc>
      }
    }
  • Forward-looking - new states can be added, and the compiler will tell you when a state isn't covered

  • Predictable performance - Rust is fast, but more importantly, its performance isn't affected by things such as garbage collection

  • The Rust book is a great place to start

  • Rustlings - guided exercises for learning Rust

Note posted on Friday, April 30, 2021 7:00 PM CDT