David Vedvick

It's been coded

These are notes I took from the 2021 Open-Source North conference, which takes place annually in Minneapolis, Minnesota.

Deep Learning for Natural Language Processing

  • Natural Language Understanding (NLU) is the focus of this talk

Natural Language Generation

  • Mapping from computer representation space to language space
  • Opposite direction of NLU

Deep Learning

  • Subfield of machine learning
  • Algorithms, called artificial neural networks, inspired by the structure and function of the brain
  • Advantage over traditional machine learning: features are extracted automatically

Text is Messy

  • Punctuation, typos, unknown words, etc.

Preprocessing Techniques

  1. Turn the text into a meaningful format for analysis (tokenization)
  2. Clean the data
    • Normalize case; remove punctuation, numbers, and stop words
    • Stemming
    • Part-of-speech tagging
    • Correct misspelled words
    • Chunking (named entity recognition, compound term extraction)
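As a rough illustration of steps 1 and 2, here is a toy Python pipeline (the stop-word list and regex are invented for this example; a real project would reach for a library such as NLTK or spaCy):

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "of"}  # tiny illustrative stop list

def preprocess(text):
    """Toy version of the cleaning steps above: lowercase, strip
    punctuation and numbers, tokenize, and drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [token for token in text.split() if token not in STOP_WORDS]

print(preprocess("The cat, obviously, sat on 2 mats!"))
# ['cat', 'obviously', 'sat', 'on', 'mats']
```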

Preprocessing: stemming

Stemming and Lemmatization = cut word down to base form

  • Stemming: uses rough heuristics to reduce words to base
  • Lemmatization: uses vocabulary and morphological analysis
  • Maps run, runs, running, and ran to the same base meaning
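A minimal Python sketch of the stemming idea - crude suffix-stripping heuristics, nothing like a production stemmer such as Porter's (the suffix list is invented for illustration):

```python
def naive_stem(word):
    """Crude heuristic stemmer: strip a few common English suffixes.

    Real stemmers (e.g. Porter) apply ordered rule sets; this only
    shows that stemming is rough string surgery. Note it cannot map
    the irregular form "ran" to "run" - that requires lemmatization's
    vocabulary and morphological analysis.
    """
    for suffix in ("ning", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([naive_stem(w) for w in ["run", "runs", "running", "ran"]])
# ['run', 'run', 'run', 'ran']
```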

Bag of words

Way of representing text data when modeling text with machine learning or deep learning algorithms
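A toy Python example of the bag-of-words representation (the documents are invented for illustration) - each document becomes a vector of word counts over a shared vocabulary, discarding word order:

```python
from collections import Counter

docs = ["the cat sat", "the cat sat on the mat"]

# Build a shared vocabulary, then represent each document as a vector
# of word counts - order is discarded, only frequencies remain.
vocab = sorted({word for doc in docs for word in doc.split()})
vectors = [[Counter(doc.split())[word] for word in vocab] for doc in docs]

print(vocab)    # ['cat', 'mat', 'on', 'sat', 'the']
print(vectors)  # [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]
```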

Word embeddings

Type of word representation that allows words with similar meaning to have a similar representation

  • They are a distributed representation of text
  • Word embedding methods learn a real-valued vector representation for a predefined, fixed-size vocabulary from a corpus of text


  • Problem: count vectors are far too large for many documents
    • Solution: Word2Vec reduces the number of dimensions (configurable, e.g. 300)
  • Problem: bag of words neglects word order
  • Skip-gram: a neural network architecture that uses a word to predict the words in its surrounding context, defined by the window size
  • Continuous Bag of Words (CBOW): uses the surrounding context to predict a word
  • What happens? The model learns which words are likely to appear near each word
  • Vectors can be combined to create features for documents
  • Use Document Vectors for ML/DL on documents (classification, etc.)
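The skip-gram idea - each word predicts the words in its window - can be sketched in a few lines of Python (a real Word2Vec implementation would feed these pairs to a shallow neural network to learn the vectors):

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as in the skip-gram
    architecture: each word predicts the words around it, where the
    context is bounded by the window size."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs(["deep", "learning", "for", "nlp"], window=1))
# [('deep', 'learning'), ('learning', 'deep'), ('learning', 'for'),
#  ('for', 'learning'), ('for', 'nlp'), ('nlp', 'for')]
```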

Feature Selection

A manual process in traditional machine learning techniques that happens automatically in deep learning

Embeddings + CNN

Using word embeddings to represent words and a convolutional neural network for the classification task. The architecture has 3 pieces:

  1. Word embedding model: generate word vectors
  2. Convolutional model: extracts salient features from documents
  3. Fully-connected model: interprets the extracted features in terms of a predictive model

Recurrent Neural Networks

  • Networks with loops
  • Allows information to persist
  • Enables connecting previous information to present task
  • Context preserved

Vanishing Gradients with RNNs

  • In their simplest form, RNNs don't work as well as we would like
  • Gradients shrink during backpropagation through time
  • Long Short-Term Memory (LSTM) units help combat the vanishing gradient problem by introducing an "error carousel".
    • Allows learning sequences, keeping track of the order without a vanishing gradient
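A back-of-the-envelope Python illustration of why gradients vanish: backpropagation through time multiplies one small derivative per time step, so after a long sequence almost no gradient reaches the early steps (the 0.5 factor is arbitrary, chosen only for illustration):

```python
derivative_per_step = 0.5  # arbitrary per-step derivative magnitude < 1
gradient = 1.0
for step in range(30):  # 30 time steps of backpropagation through time
    gradient *= derivative_per_step

print(gradient)  # roughly 9.3e-10 - effectively zero
```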

Major challenges with DL for NLP

  • Data size: RNNs/LSTMs don't generalize well on small datasets
  • Relevant Corpus: required to create domain-specific word embeddings
  • Deeper Networks: empirically, deeper networks have better accuracy
  • Training Time: RNNs take a long time to train


Robot Rock - AI and Music Composition

  • Previous Examples of Music Generation:
    • Mozart?! - devised a musical dice game for auto-generating compositions
    • ILLIAC I
    • Neural Networks

Machine Learning used for Music Generation

  • Standard feed forward networks aren't a good fit for predicting sequential events (e.g. music, text)
    • Limitation: fixed number of inputs/outputs

Recurrent Neural Networks (RNN)

  • Better for text/music
  • LSTM is key to improving results

Music Encoding Options

  • MIDI
  • Waveform

Programs that Do Music Generation

  • Amper
  • AIVA
    • Generative soundtracks for video games?!
  • LANDR - AI based mastering of music
  • Magenta (Google)
  • OpenAI: MuseNet (MuseTree is built on top of it), JukeBox
  • PopGun
  • Live Performance tools: Tidal Cycles, Orca

Artists That Use AI

  • Taryn Southern - I Am AI (2018)
  • Yacht - Chain Tripping (2019)
    • Transcribed entire backlog to MIDI to train Magenta
    • Treated ML as a collaborator
  • Holly Herndon - Proto (2019)
    • Created "Spawn", which performed music
    • She earned her PhD based on this album

Lyrics Generation

  • GPT-2 has a model to generate lyrics

Empowering Streams through KSQL

  • Querying Kafka streams through KSQL
  • Custom data integration is hard: ephemeral pipelines aren't useful, and stateful ones are hard
  • Kafka: A-to-B integration allows loose coupling, with Kafka as the middle layer
  • Kafka can handle load with a very predictable, linearly scaling model
  • Kafka partitions data within "topics"

Kafka Data Transformation

  • Single Message Transforms (SMT)
    • Transformations configured via JSON
  • KStreams: advanced message transforms in Java
  • KStream - unending list of messages arriving
  • KTable - a projection of the most recent value in a KStream
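To make the KStream/KTable relationship concrete, here is a toy Python fold (illustrative only - not the Kafka API): keeping the most recent value per key is exactly the projection a KTable performs over a KStream.

```python
# A KStream is an unending list of keyed messages; a KTable is a
# projection of the most recent value for each key.
stream = [("user1", "online"), ("user2", "online"), ("user1", "offline")]

table = {}
for key, value in stream:
    table[key] = value  # later messages overwrite earlier ones

print(table)  # {'user1': 'offline', 'user2': 'online'}
```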


KSQL

KSQL provides a SQL interface to work against KTables and KStreams

  • The EMIT CHANGES clause makes a query run continuously
  • Has basic querying capabilities, and other functions, that work against Kafka streams
  • ksqldb.io
  • Open sourced by Confluent
  • Confluent runs a cloud-native Kafka distribution

Lessons on Chaos Engineering

Chaos engineering is experimentation: building experiments around a steady-state hypothesis.

  • Not all signals are useful.
  • "The future seems implausible, the past incredible"
  • Weak signals are the signals we get before something goes wrong, and they offer important insight before failure occurs
  • Search for how close we are to failure
  • Past signals may not be future signals, future signals may come from areas that were not signalling before

Insights that Come From Weak Signals

  • On-call shifts should end on Fridays!
    • Engineers are tired at the end of a shift
    • Shifts ended on a Friday, and began for the next person that same Friday
  • A designated "ops-support" person
  • "I don't know anything about this, we'll need to talk to Emma.": signalling the system is approaching a boundary - what happens if Emma decides to pursue other opportunities?
  • Value proposition of chaos engineering is the insights you gain
  • Rare that a single signal is strong enough
  • Having a multi-functional product team is the best way to make products

Technical Excellence through Mob Programming

  • Retrospectives: tie together learning time
  • 1 year of no bugs! Organization chose to scale mob programming.

How to Mob Program

The Mob Programming RPG


  • Driver: drives the PC
  • Navigator: gives the directions on what to program
  • Mobber: yields to less privileged voices, contributes ideas
  • Researcher: breaks off on tangents to look into different ideas
  • Sponsor: speaks up for others
  • Navigator of the Navigator: navigates the navigator!
  • Automationist: sees a developer doing the same thing over and over again, and might be able to automate those things
  • The Nose: calls out code smells
  • Traffic Cop: keeps everyone in line

Other mob role taxonomies exist as well.


  • Treat everyone with kindness, consideration, and respect
  • No one between code and production
  • Clean Code - code expressed cleanly within the domain
  • Zero Bugs!
  • Deliver Working Software to Production Consistently
  • Anyone can take a vacation (zero silos)
  • Effective interdepartmental ownership
  • Continuously develop lofty goals and practices
  • Experiment Frequently with small changes


  • High Bandwidth Learning
  • Quality and Technical Debt
  • Group Conscientiousness
  • Flow is easier in a mob vs pairing
  • No more bugs!

Law of Personal Mobility

  • If you are not contributing or learning, go to a different mob

There and Back Again: Our Rust Adoption Journey

  • Async implies IO

  • & means a value is passed into a method by immutable reference

    • async fn verify_signature(token: &Jwt)
  • State Machines - enabled by enum variants having fields

    enum User {
      Pending {
        email: Email
      },
      Active {
        email: Email,
        confirmation_timestamp: DateTime<Utc>
      }
    }
    • Future looking - new states can be added, and compiler will tell you when a state isn't covered
  • Predictable performance - Rust is fast, but more importantly, its performance isn't affected by things such as garbage collectors

  • The Rust book is a great place to start

  • Rustlings

Note posted on Thursday, May 20, 2021 3:48 PM CDT - link

The Satisfaction of UI Development

There's something oddly satisfying about UI Development. These days I spend most of my time doing back-end, server-side work, and that's highly satisfying as well... but there's something about being able to see and play with the results yourself when a change is made.

Anyways, that's of course going to lead to a screenshot of my latest update to project blue, which introduced a fancy, transparent, right-handed sliding drawer for the Now Playing list:

New Slide Out Now Playing Drawer

Note posted on Friday, September 20, 2019 11:47 PM CDT - link

A Conceptual Model of Async/Await

Microsoft released the async/await programming model to the world in 2012. While many of us diligent C# developers have been advocating the use of async/await since then, I have noticed many programmers still struggle with finding the benefits of async/await worth the cost of understanding it. There are a few behaviors I notice when looking through code:

  1. Developers will often go for synchronous code over asynchronous
  2. Continued use of Task.Result to force a task into performing synchronously

These behaviors indicate (to me) that developers have a hard time seeing the benefit of asynchronous code over synchronous. And to be fair, in their day-to-day development activities, it may seem like async/await just adds another layer of cognitive overhead to an already difficult job. Throughout this post, I will try to give a simple model with which to conceptualize async/await and show some of the fun things you can do with async/await.

Most of this text is also applicable to Javascript Promises, and their async/await implementation. This is because C# Tasks and Javascript Promises are both implementations of the same construct, better covered by Wikipedia than myself: https://en.wikipedia.org/wiki/Futures_and_promises.

The Simple Model

This is all that async/await does:

  1. An await signals that you desire for your current line of code to pause execution until the task you are waiting for completes execution. Once the task completes, the code after your current line will continue execution. You can only await an object that has a GetAwaiter() method which returns an implementation of INotifyCompletion. In the wild, this is usually a Task class.
  2. An async tells the compiler that you want to use the await keyword in a function. The function must then return a Task, a Task<...>, or void.


Now that we have a simple conceptual model for async/await, we can have a lot of fun with async/await while continuing to write code that is easy to reason with.

Imperative Code Style

Below is a simple example to illustrate the difference between using a Task with continuations and using a Task with await:

public Task Main()
{
  return GetData()
    .ContinueWith(myData =>
    {
      var myProcessor = new Processor();
      return myProcessor.Process(myData.Result);
    }, TaskContinuationOptions.OnlyOnRanToCompletion);
}

The above callback style model can become this:

public async Task Main()
{
  var myData = await GetData();

  var myProcessor = new Processor();
  await myProcessor.Process(myData);
}

This contrived example obviously doesn't show much, but the code does arguably become cleaner. Async/await also opens us up to using other programming constructs, such as loops, with our asynchronous code. So taking the above example, let's say the data is an IEnumerable:

public async Task Main()
{
  var myData = await GetData();

  var myProcessor = new Processor();

  foreach (var data in myData)
    await myProcessor.Process(data);
}

I'm not even going to bother to try to write this with a Task chain! It's important to note that while this code does not block while waiting for the data to process, the code still does execute in-order. Async/await enables us to write non-blocking code with our typical C# programming style, which is really neat. However, we can do some even more interesting things with Tasks.


Interleaving

Interleaving is one of my favorite uses of async/await. Rather than immediately pausing execution on a long-running task, we can start execution of the task, hold onto the task itself, and then start execution of other tasks in the meanwhile. When we're actually ready for the result of the first task, we can then await it. For example:

public async Task Main()
{
  var myDataTask = GetData(); // Hold onto a reference of the Task, rather than the result of the Task

  // Execution can still be paused in the method on the Tasks below, and the myDataTask
  // will continue execution in its own synchronization context (usually a Thread)
  await Console.Out.WriteLineAsync("Please input some additional data while the other data loads!");
  var additionalData = await Console.In.ReadLineAsync();

  var myProcessor = new Processor();

  // Finally pause for `myDataTask` to complete execution
  await myProcessor.Process(await myDataTask, additionalData);
}

This allows gathering data from your data source while the user is keying in other data, potentially masking the time it took to retrieve the data from the data source.


Racing

Tasks also have the combinatorial helpers Task.WhenAll(Task...) and Task.WhenAny(Task...) (or Promise.all(Promise...) and Promise.race(Promise...) for those Javascript folks following along). While the applications of Task.WhenAll may seem obvious (and I will cover them soon), it is a little more difficult to find a use for Task.WhenAny. Oftentimes, I'll use Task.WhenAny when I want to execute multiple tasks and change the control flow of my program as a result of which task completed first - timing out execution is a clear use case for this pattern.

Let's say we're waiting for our data to come in on a message bus:

public async Task<int> Main()
{
  var myDataTask = GetDataOffOfMessageBus();

  var raceResult = await Task.WhenAny(myDataTask, Task.Delay(TimeSpan.FromMinutes(30)));

  if (myDataTask != raceResult) // Task.WhenAny returns a `Task<Task>`, whose result is the winning task
  {
    await Console.Out.WriteLineAsync("What's taking that message so long?");
    return -1; // Unceremoniously stop the program
  }

  var myProcessor = new Processor();
  await myProcessor.Process(await myDataTask); // You can also await the same Task multiple times; the result is held onto by the task once it completes, so it will always be the same **reference** or **value**

  return 0;
}


Aggregation

Once you start interleaving and racing your code, you will start having the desire to just not wait at all for anything until it's absolutely needed. Aggregation of your tasks using Task.WhenAll(Task...) can help you with this.

Let's take the above example where we iterated over an IEnumerable of data. Let's instead use the combined power of LINQ and Tasks to process all of that data all at once:

public async Task Main()
{
  var myData = await GetData();

  var myProcessor = new Processor();

  await Task.WhenAll(myData.Select(async data =>
    await myProcessor.Process(data))); // You could also just return the Task returned by `Process`, which might give you some speed benefits
}

A Cautionary Note

It's important to note that except for the first two examples, all the above examples introduce concurrency into your code. Along with concurrency come all the concerns that plague concurrent models: deadlocks, unsynchronized state, etc. That being said, it's still completely acceptable (and possible) to write non-concurrent code with async/await, such as in the first two examples, and still realize benefits. That's because every operating system comes with a maximum number of supported native threads [0][1] (which many runtimes build upon), and an await call frees up the current thread to be used by something else. This is why async/await helps your application more efficiently use its host's resources - whether it be for parallel processing or just conservative use of threads. I hope you come to enjoy the benefits of the more responsive and conscientious applications that async/await brings you as much as I have.

[0] Increasing number of threads per process (Linux) (https://dustycodes.wordpress.com/2012/02/09/increasing-number-of-threads-per-process/)

[1] Does Windows have a limit of 2000 threads per process? (https://devblogs.microsoft.com/oldnewthing/20050729-14/?p=34773)

Note posted on Sunday, March 24, 2019 10:30 AM CDT - link

Things I've Built While Building an Android Music Player

Five and a half years ago, I began building an Android application to stream music from my home server, which runs JRiver Media Center. Looking back, the journey has been as rewarding as, or more rewarding than, the destination, as I only have about 20 active users on a given day beyond myself, but the software I've built along the way has been extremely fun to build. On the other hand, it's often been an act of frustration to build an application in Android's language of choice, as many of the tools I expected to have even back in 2012 were missing from the Java ecosystem. As a result, I built some of these tools myself.

Those tools were Lazy-J, a simple Lazy instantiation library, Artful, a library that serializes SQLite queries and results from and to Java classes, and Handoff, a promise library for easy asynchronous programming. While each of these libraries could probably have had many posts devoted to each of them, I find writing painful, so I'll devote this one blog-post to them all, and hopefully do them some of the justice they deserve.


Lazy-J

One of the things I missed when I was working in Java was lazy instantiation in the standard library. While there are all sorts of recommendations on how to do lazy instantiation, the approaches usually only apply to static methods, and rely on intimate knowledge of the runtime and the Java spec to properly expect lazy operation. I needed a way to do a private final Lazy<MyObject> lazyObject = new Lazy<MyObject>(), because I often needed to use a shared resource in my Android view objects without knowing when they'd be created.

I've seen IOC frameworks such as Dagger do this, but my application was never complicated enough to warrant using an IOC framework. I also confess to not being a huge fan of Java IOC frameworks due to their dependence on attributes (although it's understandable given the language's limitations).

Usage of Lazy-J is pretty straight-forward - a simple usage, just new-ing up an object, looks like this:

class MyClass {
  private final Lazy<Object> lazyObject = new Lazy<>(Object::new);
}

I've abused Lazy-J excessively, for example, I don't want to have to run findView all the time to get a view, so I have a class titled LazyViewFinder, which will hold on to the reference to the view the first time you use it. This class lets me hold a reference to the view in my class, so I can reference my views like this:

public class Activity {
  private LazyViewFinder<TextView> lblConnectionStatus = new LazyViewFinder<>(this, R.id.lblConnectionStatus);

  protected void handleConnectionStatusChange(int status) {
    final TextView lblConnectionStatusView = lblConnectionStatus.findView();
    switch (status) {
      case BuildingSessionConnectionStatus.GettingLibrary:
        // ...update lblConnectionStatusView for each status
    }
  }
}
You can find lazy-j at https://github.com/danrien/lazy-j.


Artful

Artful is a library that also began from a source of frustration: at the time (and even today?) there are not many easy, transferable ways to map the results of a SQL query against the built-in SQLite database onto a Java object.

In C#, using the excellent Dapper library, one can just do this:

using (var connection = new SqlConnection())
  return await connection.QueryAsync<MyObject>("select * from my_object");

Perfectly bridging the gap between SQL queries and a strongly-typed language, while still allowing each language to flex its respective strengths (small note: I think future interop efforts between languages should focus on these types of bridges).

Due to type erasure, we can't do quite the same thing in Java, but with Artful, I managed to get pretty close:

public Collection<Library> getLibraries(Context context) {
  final RepositoryAccessHelper repositoryAccessHelper = new RepositoryAccessHelper(context);
  try {
    return repositoryAccessHelper
      .mapSql("SELECT * FROM library")
      .fetch(Library.class);
  } finally {
    repositoryAccessHelper.close();
  }
}

Artful is new'd up inside of RepositoryAccessHelper, which does this:

public Artful mapSql(String sqlQuery) {
  return new Artful(sqliteDb.getObject(), sqlQuery);
}

Since Artful caches the SQL queries and the reflection, it becomes pretty performant after the first round of serialization on a given class. It is by no means feature-complete, for example it can't serialize directly to primitive types - repositoryAccessHelper.mapSql("SELECT COUNT(*) FROM library").fetchFirst(Long.class) would likely give you nothing - but it has served me remarkably well, without memory leaks as well.

You can find Artful at https://github.com/namehillsoftware/artful.


Handoff

Asynchronous programming in Java is really painful. Java has Future, which purports to be the way to perform asynchronous behavior, but it merely introduces a spin-lock to achieve receiving a value synchronously on your calling thread. While this may be fine behavior for a server application (although many a NodeJS and nginx server may disagree with you on the principle of thread starvation), for a desktop application, knowingly introducing blocking behavior into your application is borderline offensive.

Android attempted to make this better with AsyncTask, but I found it to be nearly as painful to work with as Future:

  1. It's difficult to easily chain one asynchronous action to another without developing verbose APIs
  2. It has messy internal state that leaks into your consumption of the library with unchecked exceptions
  3. Results post back to the UI thread on which the task was called, which is often all that is wanted in simple applications, but this starts to cause trouble the second you have to perform additional asynchronous work

Handoff is the last library I've developed for my little music player, and the one of which I am proudest. Handoff aims to be a Promises A+ like promise library for Java. While Java has CompletableFuture, I find its API surface layer to be rather large. In this instance, I also found C#'s Task library to be very verbose - they had to introduce an entirely new language primitive just to make it easier to work with (async/await)! I even speculate that the Task library was explicitly written to set the stage for async/await ;).

I really like the ergonomics of Javascript's Promise class, and thought it would be fun to see if I could make something like that for Java. It has both been fun to develop and tremendously beneficial! I wish it were easy to show a side-by-side comparison of my app's responsiveness before and after using Handoff, but the difference has been night and day for me - ever since switching to a promise-like async model, I rarely get UI hangs or unresponsive warnings from the OS, and unexpected IllegalStateException hangs have gone completely away.

Handoff in the simple case is used like this:

new Promise<>(messenger -> {
  final IPositionedFileQueueProvider queueProvider = positionedFileQueueProviders.get(nowPlaying.isRepeating);
  try {
    final PreparedPlayableFileQueue preparedPlaybackQueue = preparedPlaybackQueueResourceManagement
      .initializePreparedPlaybackQueue(queueProvider.provideQueue(playlist, playlistPosition));
    startPlayback(preparedPlaybackQueue, filePosition)
      .firstElement() // Easily move from one asynchronous library (RxJava) to Handoff
      .subscribe(
        playbackFile -> messenger.sendResolution(playbackFile.asPositionedFile()), // Resolve
        messenger::sendRejection); // Reject
  } catch (Exception e) {
    messenger.sendRejection(e);
  }
});

The returned promise can then be chained as you'd expect:

  .then(f -> { // Perform another action immediately with the result - this continues on the same thread the result was returned on
    // perform action
    return f; // return a new type if wanted, return null to represent Void
  })
  .eventually(f -> { // Handoff the result to a method that is expected to produce a new promise
    return new Promise<>(m -> {
      // do asynchronous work, resolving via m
    });
  })
  .excuse(e -> { // Do something with an error, errors fall through from the top, like with try/catch
    return e;
  });
Handoff can be found here - https://github.com/danrien/handoff.

And the application that motivated me to build all of these little libraries is Music Canoe, which can be found here - https://github.com/danrien/projectBlue.

Note posted on Sunday, April 8, 2018 12:02 PM CDT - link

Concurrency vs. Parallelism: A breakfast example

One of the harder problems in Computer Science is concurrency. This summer at my employer, I gave a presentation on asynchrony, which I consider nearly the same thing as concurrency. One of the things that I felt was lacking was my explanation of concurrency vs. parallelism. I ended up just giving a textbook explanation:

  • Parallel processing is taking advantage of a lot of processors (local or remote) to run calculations on large volumes of data
  • Asynchronous execution is freeing up the processor to do other things while a lengthy operation is occurring

This morning, however, I was making myself breakfast, and I thought up a useful analogy.

A breakfast example

My breakfast this morning consisted of a coffee and two slices of toast with peanut butter.

A perfect example of concurrency was how I made the breakfast: I first started the toast, then put the coffee cup in the Keurig and pushed the brew button on the Keurig. This is a concurrent operation - one job (toasting the bread) was started, and after that began, I (the processor) was freed up to start another job, the "brew coffee" operation.

We can take this analogy further: the toaster can actually process two pieces of bread at once, which is a parallel operation. From here, we can easily see that parallelism is a subset of concurrency: the toaster is technically performing two operations concurrently; what makes it a parallel operation is the fact that it's the same process occurring twice, started at the same time within the same machine.

Should we write this as C#?

public static class Program {
  public static async Task Main() {
    await MakeBreakfast();
  }

  public static async Task MakeBreakfast() {
    // Start toast - this operation takes the longest to complete, so let's get
    // it started as soon as possible
    var toaster = new StandardToaster(new ElectricitySupplier());
    var toastingTask = toaster.Toast(new WheatBread(), new WheatBread());

    // Now start the Keurig, a relatively short operation
    var brewer = new Keurig(new Water());
    var brewingTask = brewer.Brew(new DarkCoffee());

    // Don't return control to the human until operations complete
    await Task.WhenAll(toastingTask, brewingTask);
  }

  public interface Toaster {
    Task<IEnumerable<Toast>> Toast(Bread firstSlice, Bread secondSlice);
  }

  public interface Brewer {
    Task<Coffee> Brew(GroundCoffee groundCoffee);
  }
}

Concurrency and Parallelism in Real Life

The thing about concurrency and parallelism is that you do them all the time in real life; for example, humans are terrible at multi-tasking (parallel processing), but are great at starting multiple jobs and then taking action when they finish (concurrent processing).

I encourage everyone to always think of how the things they do in real life apply to different concepts in software development. Since software development is all about automating real life processes, these analogies actually occur much more frequently than one would expect!

Note posted on Sunday, December 3, 2017 11:45 AM CST - link

What is Software Engineering and are we Software Engineers?

In our day jobs, we often call ourselves many things (or our HR departments call us these terms for us):

  • Software Developer
  • Programmer
  • Software Designer
  • Software Engineer
  • Technologist (what does this even mean?)

I think most of us prefer the term "Software Engineer" at the end of the day; it tends to properly transmit the problem-solving needs, due diligence, and rigor required in our job, even if we don't apply those three traits all the time. We may even work with people ensuring the validity (and thus quality) of our software, who hold the title of "Software Quality Engineer".

But what really defines a Software Engineer? It's not enough to just have a title that gives a good feeling of the difficulties our job entails - is it? Well this question is answered easily enough - a Software Engineer is someone who practices Software Engineering!

But then the next practical question becomes... what is Software Engineering? Wikipedia gives Software Engineering five possible definitions, which I'll repeat here for convenience:

  • "research, design, develop, and test operating systems-level software, compilers, and network distribution software for medical, industrial, military, communications, aerospace, business, scientific, and general computing applications."

  • "the systematic application of scientific and technological knowledge, methods, and experience to the design, implementation, testing, and documentation of software";

  • "the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software";

  • "an engineering discipline that is concerned with all aspects of software production";

  • "the establishment and use of sound engineering principles in order to economically obtain software that is reliable and works efficiently on real machines."

Ok, so many of these definitions look somewhat flimsy on the face of it; perhaps for proper definition, one should look at what gave birth to other engineering disciplines.

The Birth of an Engineering Discipline

Mary Shaw of Carnegie Mellon University, in a keynote address at the International Conference on Software Engineering, determined that an engineering discipline emerges following these rough steps:

From craft practice to engineering discipline (diagram: the process a craft practice goes through to become an engineering discipline)

  1. People begin using a new discovery or tool to solve some problems in simpler, newer, faster ways; this is the emergence of a craft practice around the discovery - say, the emerging field of Computer Science in the late 1950s through the 1980s.
  2. Some of these people start businesses with their new, disruptive products, others are hired to disrupt existing businesses
  3. The businesses begin running into problems with the new products; eventually, practitioners of the craft in the field cannot solve problems of ever-increasing complexity alone, spurring the need for research into how to solve these problems
  4. Scientists develop new practices and methodologies and make new discoveries in the field to solve problems; in order to properly disseminate these findings, the scientists use all tools available: documentation, training, etc.
  5. The findings of these discoveries eventually coalesce into a discipline; finally we have engineering!

Software development is far along the path of becoming an engineering practice; people use the computer sciences to solve common problems in business, government, and medical fields. As problems are found with current patterns and practices, subsequent solutions are found and dispersed through many avenues (a common one in our field is Stack Overflow!).

Some tools and processes seem to introduce sea-changes in producing reliable code. The SOLID approach to object-oriented software design was one. From this, a whole literature has stemmed to produce consistently SOLID software designs. Peer review and test-driven development also seem like step-wise improvements in producing reliable software (by stressing as many variations of state that take place in a finite state machine as possible at time of development, a developer takes the step of not only ensuring correct operation of the code at the time of development, but also correct forward operation of the code in the future).

However, we aren't quite at the point where the science and tooling and practice sides of the equation have caught up to produce highly reliable code and solve novel problems at a high frequency.

If Building Software is not yet an Engineering Practice, are we Engineers?

Now comes the chicken and egg question: does an engineering practice make an engineer, or does an engineer make an engineering practice?

Instead of using the above process that Mary Shaw went through to define an engineering practice, let's look at what defines an engineer. From there we can maybe answer the question of what an engineer is without answering the question of what an engineering practice is.

Let's think of the situation of an electrical engineer and a certified electrician: both are capable of designing working electrical circuits. Both are knowledgeable in the real-world limits and dangers of electrical equipment and components. An electrician can likely solder components onto a board just as quickly and deftly as an electrical engineer. In other words, they're both capable of understanding and applying circuit theory.

Where do they differ? What makes an electrical engineer's degree and certification harder to achieve? What do electrical engineers bring to the table that electricians do not? Perhaps the engineer solves novel electrical problems, but I think that an electrician is also capable of that when working within his own knowledge and what he has learned beyond that. Perhaps the engineer is tasked with staying at the forefront of his field, but a good electrician should also stay current with the field (and may be required to as well by government).

It seems more correct to say that electrical engineers (are supposed to) have the ability to contribute back to the field of electrical engineering when a novel problem requires a novel solution outside of the bounds of the field's existing body of knowledge. So maybe it is sufficient to say that an Engineer has enough mastery of the field they work in that they can contribute back to it with novel solutions from outside of the discipline as it exists in that moment.

So are software engineers, you know, engineers? In most engineering disciplines, proper testing and certification is required by state and national boards in order to properly claim that one is an engineer. This type of certification does not exist yet. IEEE offered a Certified Software Development Professional program at one time, but that was discontinued in 2014. Instead, they now offer certifications in multiple areas of software development, with the reasoning seeming to be that software development covers too broad of a spectrum of software creation at the moment to be grouped into one certification.

So at present, it doesn't seem that there are any widely recognized certifications that provide a definitive "software engineer" title. However, that doesn't mean that there are not many of us today who are in effect practicing the same disciplines as other engineers; it just may be that there is not yet enough agreed upon material for there to be a known, written determination of what makes the software engineering discipline.

So what?

I do predict that one day - perhaps 1500 years from now, but hopefully not that long - the title of Software Engineer will be a professional distinction that requires full testing and certification. At the end of the day, does any of this matter? If the rest of the industry is using the title of "Software Engineer", then there doesn't seem to be any good reason to be apprehensive about using it ourselves. However, I think after taking all of the above into account, we should feel encouraged and motivated to continue growing our practice, and contributing as much as we can to the development of software engineering as a discipline!

Note posted on Thursday, May 19, 2016 12:31 AM CDT - link

lazy-j: Lazy Java initialization library

Coming from the C# world, while working on audiocanoe I've often had the overwhelming desire to use something similar to the Lazy class in the .NET standard libraries. Using it, you can easily initialize any object lazily without needing to implement your own double-checked locking for lazy initialization.

So in a hasty moment, I wrote a library called lazy-j which supposedly guarantees your object will lazily be created the first time it is requested, using the supplied initialization function. It should also be thread-safe. It is also EXCEEDINGLY simple, here's the source:

package com.vedsoft.lazyj;

/**
 * Created by david on 11/28/15.
 */
public abstract class Lazy<T> {

    private T object;

    public boolean isInitialized() {
        return object != null;
    }

    public T getObject() {
        return isInitialized() ? object : getValueSynchronized();
    }

    private synchronized T getValueSynchronized() {
        if (!isInitialized())
            object = initialize();

        return object;
    }

    protected abstract T initialize();
}
There are some nice things here; it uses Java's built-in synchronized methods to do a double-checked lock for object initialization. It doesn't have all the niceties of Microsoft's library (such as different degrees of thread-safety), but it gets the job done nicely while being simple enough to understand at a glance.

Usage is also fairly simple. To instantiate a new object do something like below:

class MyClass {

    public static Lazy<MyCrazySingletonConfig> myCrazySingletonConfig = new Lazy<MyCrazySingletonConfig>() {
        protected MyCrazySingletonConfig initialize() {
            final MyCrazySingletonConfig newConfig = .....

            return newConfig;
        }
    };
}

class SomeOtherClassThatNeedsConfig {

    public void doingThingsWithConfig() {
        final String property = MyClass.myCrazySingletonConfig.getObject().getMyCrazyProperty();
    }
}
You can view the source here!

Note posted on Tuesday, January 5, 2016 11:04 PM CST - link

Sync your media to your phone with Audiocanoe

Sometimes, Santa wants to listen to his music in his sleigh but his connection is spotty

What can Santa do? He can sync his music to his phone with the new version of audiocanoe that is in testing! Following the gif below, Santa can easily sync his favorite playlists and within minutes have them on his phone for playback:

Syncing Playlist

To grab it, head over to the beta test site and opt-in to help test it out!

Happy holidays!

Note posted on Thursday, December 24, 2015 1:51 PM CST - link

New Materialish Look for audiocanoe

audiocanoe is seeing some updates coming up to match Google's new Material Design specs! Take a look below:

Browsing Library

Now Playing

Note posted on Tuesday, October 13, 2015 7:20 AM CDT - link

Use Git to Manage Your Blog History!

One of the major problems of rolling your own weblog is properly managing the history of your posts.

The aim of this post is to elucidate how one can easily manage blog history, using only Git.


The best known methods for managing history of text documents have always been terrible. Yes, I'm speaking of Wordpress, but also commercial solutions like SharePoint, or the version tracking that has been built into Microsoft Word for the longest time.

Here's a list of cons that I always think of when using these tools:

  1. Inconsistently track history
  2. Sometimes can leave comments, sometimes can't
  3. Usually diffing is either unavailable or is built using some proprietary/internal code that probably doesn't work well
  4. Obfuscated via dense database models, XML formats, and/or binary formats
  5. Difficult to use third party tools with them
  6. Content management systems, which is what all blog engines are, need security to manage the blog. These security systems usually come riddled with bugs and security flaws.

Along comes lowly git, the little DVCS tool that could, which fills in the above gaps nicely. Combine this with a nice text format such as markdown, and you've got yourself a nice, versioned, document management system.

However, it does come with its own set of cons:

  1. The git learning curve
  2. Git doesn't natively store post metadata
  3. Git is a version control system, and thus doesn't track file metadata either — so "true" file creation time, last modified time are not available
  4. Wrapping git commands up in your favorite server-side language can sometimes be tricky
  5. Versioning doesn't happen automatically, but rather on intentional commits

None of this is a show-stopper, however. Yes, git is ridiculous to learn. Yes, you can't get the "true" file creation time. But none of it bothered me much.


This is how I did it with nodejs:

  1. Create a git repo (git init) where you want your posts to reside.

  2. Use a nice sane format to store metadata about your posts. I'd personally go with at least a JSON-like format. Mine looks like below:

     title: Use Git to Manage Your Blog History
     author: vedvick
     ---

    The --- signals to the parser that the metadata section is complete.

  3. Grab the posts from a configured or constant location. This is my highly sophisticated version:

     glob(path.join(notesConfig.path, '*.md'), function (err, files) { ... });

    Following a simple convention of prefixing filenames with the date the post is created, such as 20151006-use-git-to-manage-your-blog-history.md, the server can then easily and reproducibly sort the files by the created date.
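As a quick sketch of why that convention works (with invented filenames, not my actual posts), a plain string sort is enough:

```javascript
// Invented example filenames, each prefixed with a zero-padded YYYYMMDD date.
var files = [
    '20151006-use-git-to-manage-your-blog-history.md',
    '20160519-are-software-developers-engineers.md',
    '20151224-sync-your-media.md'
];

// Because the date prefix is zero-padded, lexicographic order is
// chronological order; reverse it to list the newest post first.
var sorted = files.slice().sort().reverse();

console.log(sorted[0]); // → '20160519-are-software-developers-engineers.md'
```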

  4. Parsing the notes has a little sophistication to it. Here's the code used on my server in full:

     var parseNote = function (file, callback) {
         parseNote.propMatch = /(^[a-zA-Z_]*)\:(.*)/;
         fs.readFile(file, 'utf8', function (err, data) {
             if (err) {
                 callback(err);
                 return;
             }

             var textLines = data.split('\n');
             var fileName = path.basename(file, '.md');
             var newNote = {
                 created: null,
                 pathYear: fileName.substring(0, 4),
                 pathMonth: fileName.substring(4, 6),
                 pathDay: fileName.substring(6, 8),
                 pathTitle: fileName.substring(9)
             };

             var lineNumber = 0;
             for (var i = lineNumber; i < textLines.length; i++) {
                 lineNumber = i;
                 var line = textLines[i];
                 if (line.trim() === '---') break;

                 var matches = parseNote.propMatch.exec(line);
                 if (!matches) continue;

                 var propName = matches[1];
                 var value = matches[2].trim();
                 switch (propName) {
                     case 'created_gmt':
                         newNote.created = new Date(value);
                         break;
                     case 'title':
                         newNote.title = value;
                         break;
                 }
             }

             newNote.text = textLines
                                 .slice(lineNumber + 1)
                                 // add back in the line returns
                                 .join('\n');

             if (newNote.created !== null) {
                 callback(null, newNote);
                 return;
             }

             if (!notesConfig.gitPath) {
                 // JS months are zero-indexed, so subtract one from the parsed month
                 newNote.created = new Date(newNote.pathYear, newNote.pathMonth - 1, newNote.pathDay);
                 callback(null, newNote);
                 return;
             }

             exec('git -C "' + notesConfig.gitPath + '" log HEAD --format=%cD -- "' + file.replace(notesConfig.path + '/', '') + '" | tail -1',
                 function (error, stdout, stderr) {
                     if (error !== null) {
                         callback(error);
                         return;
                     }

                     newNote.created = new Date(stdout);
                     callback(null, newNote);
                 });
         });
     };
    The neatest part here (and where git or some other version control system shines) is using it to determine the note's created date:

     exec('git -C "' + notesConfig.gitPath + '" log HEAD --format=%cD -- "' + file.replace(notesConfig.path + '/', '') + '" | tail -1',
          function (error, stdout, stderr) {
              if (error !== null) {
                  callback(error);
                  return;
              }

              newNote.created = new Date(stdout);
              callback(null, newNote);
          });

    Note how this doesn't actually return the true "created" timestamp of the file, but it does, in my opinion, return a timestamp that is close enough.

  5. When drafting a new post, create a new branch so the draft can be worked on in isolation without affecting work on other posts (for example, I posted an entirely different post while drafting this one). For this I also follow another convention: post/<post-name-here>. Of course, the convention is optional but I think at the very least it encourages consistency.
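For example, a draft under that convention might flow like this (the post and file names here are placeholders, not my actual posts):

```shell
# Hypothetical branch-per-post workflow; 'my-next-post' is a placeholder name.
git checkout -b post/my-next-post   # start the draft on its own branch
# ...write, then commit the draft as it evolves...
git add 20160101-my-next-post.md
git commit -m "Draft: my next post"
git checkout master                 # other posts can be published meanwhile
```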

  6. Finally, merge posts into master and push to a web server. Then add a post-receive hook that checks out master to the location determined above: GIT_WORK_TREE=<note-location> git checkout -f
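A minimal sketch of that post-receive hook, assuming placeholder paths (not my actual server layout):

```shell
#!/bin/sh
# Hypothetical hooks/post-receive script in the bare repository on the server.
# /var/www/notes is a placeholder for the directory the blog serves posts from.
GIT_WORK_TREE=/var/www/notes git checkout -f master
```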

Note posted on Tuesday, October 6, 2015 7:21 AM CDT - link