Introduction to Helm Charts (A Market Analysis)
Brian Overland, April 2020
The Purpose of Helm and Related Software
Deployment is the installation of new software, along with updates. On large projects, this is a time-consuming activity, and sometimes an error-prone and expensive one. Problems stem from:
1) The number of pieces that typically have to be installed,
2) The assumptions made on the development side, because developers don’t see the headaches that operators deal with,
3) The need to support multiple operating systems and different processors,
4) The need to make sure that all the pieces stay in sync with each update. With periodic updates, there is ample room for error when version control is lacking.
The Helm approach, along with the underlying support of Kubernetes and Docker, provides a better way to distribute, install, and update software. The deployed software runs in a nearly identical way on every operating system. Software is placed in a series of “containers”: these are like tiny computers within the computer, in which every aspect of the configuration is controlled. Containers accomplish this much more efficiently than virtual machines, because a container uses the existing operating system rather than carrying its own virtual operating system.
Helm puts the installation tasks under the control of a process centered in one human-readable file: the Helm chart. The chart acts as a kind of manager of the deployment, and it sets reasonable defaults. Yet it also enables operators to adjust parameters, so flexibility is preserved.
With a couple of commands, Helm can be used to remove one version of a deployment and build (“spin up”) another. Helm, with its version control, deploys reliably and consistently on any system that has the underlying Kubernetes and Docker infrastructure installed. That infrastructure is installed easily on any recent operating system.
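As a rough illustration (the release name speech-stack and chart directory ./speech-chart are hypothetical, not actual LumenVox artifacts), those day-to-day commands look something like this:

helm install speech-stack ./speech-chart      # spin up a new deployment from a chart
helm upgrade speech-stack ./speech-chart      # roll out a new version of the same deployment
helm rollback speech-stack 1                  # return to an earlier revision if needed
helm uninstall speech-stack                   # remove the deployment entirely

Each operation is recorded as a numbered revision of the release, which is what gives Helm its version control.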
Docker, Kubernetes, and Helm have strong and growing support in the DevOps community, so investments in this technology are likely to be rewarded. Adopting this stack is increasingly how organizations stay current; ignoring it means falling behind.
Each business will need at least one person in the organization who understands this stack of deployment software, but once it is understood, it can be used to deploy software, and any number of updates, with consistency, speed, and version control.
How This Fits into the LumenVox Business Model
At LumenVox, we provide supporting technology for speech recognition and related areas. A major focus of the company has been to provide the best possible customer experience; for this reason, we’ve created tools to automate as much of the installation process as possible.
Moving to the Helm/Kubernetes/Docker model will enable us to stay ahead in these areas. Once the technology is in place, deployment is reduced from a complex task to launching a command or two. And because each update is managed by a Helm chart, the problems caused by lack of version control are avoided.
Software maintenance costs are reduced, potentially a great deal, and perennial coordination problems are avoided: “Do I have a piece of software from version 2.0 combined with configuration changes for version 3.0?”
Conclusion
The general analogy for these layers is a nautical one. To complete the story, think of the challenges facing wooden sailing ships in the 18th century. When maps or navigational aids were less than perfect (which they frequently were), ships ran aground, and entire shipments were lost at sea along with their crews.
But as navigational tools improved, British sailing ships ruled the waves, reaching their commercial ports faster and more consistently, without crashing into rocks and shoals. Kubernetes comes from the Greek word for “helmsman” or “pilot”; Helm, likewise, takes its imagery from the ship’s wheel and its navigational charts. With these aids, the ship of commerce comes safely into port.
Introduction to Helm Charts (Brief Technical Overview)
Brian Overland, April 2022
Helm is a technology that uses certain text documents (charts and YAML files) to automate and control the process of software deployment on a variety of platforms. It does so in a way that helps ensure the software runs reliably, and in the same way, on every platform. Experience has shown that it reduces the time required for updates by at least 66%.
The big picture involves a few basic steps. To install LumenVox software using this technology, here is what must be assumed:
1. The customer needs to have the Helm, Kubernetes, and Docker infrastructure installed. All these are relatively inexpensive (and in fact free to start with), and they have simple install programs themselves.
2. This infrastructure is easily installed on recent versions of all the major operating systems. The underlying concept is software that runs the same way on all platforms, solving headaches for developers and operators alike. This is why we use the term “DevOps”: it’s no longer a developer working in isolation.
3. The LumenVox customer needs to have certain other pieces of infrastructure installed as well: Redis, RabbitMQ, Postgres, MongoDB, and persistent volume storage. This is necessary so the containers (which are like virtual machines but more efficient) don’t have to carry those management systems themselves, which would bloat the size of the containers.
4. With this infrastructure installed, Helm and the related technologies (Kubernetes, Docker) generate the specific databases the deployment needs. Helm, in effect, issues the orders.
5. For each deployment we sell, customers get a license, as usual, as well as a download of the software they’ve licensed.
Once all these pieces are installed (and that should not be difficult), Helm does the rest. It uses files called “charts” to respond to as little as one command-line instruction and then builds (“spins up”) all the right versions of the containers under the required configuration. The process is fast and yet flexible, responding as needed to values (parameters) and placeholders.
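For instance, assuming a chart directory named ./speech-chart (a hypothetical name used here only for illustration), spinning up and later unwinding a deployment can be as simple as:

helm install speech-stack ./speech-chart      # build ("spin up") everything the chart describes
helm uninstall speech-stack                   # unwind the deployment again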
How does it work? Let’s examine the file structure that Helm assumes is in place.
Chart.yaml # A YAML file containing metadata about the chart (name, version, and so on)
LICENSE # OPTIONAL: A plain-text file containing the license for the chart
README.md # OPTIONAL: A human-readable README file
values.yaml # The default configuration values for this chart
values.schema.json # OPTIONAL: A JSON schema for imposing a structure on the values.yaml file
The file structure also contains the following subdirectories: charts/ (dependency charts), crds/ (Custom Resource Definitions), and templates/ (the template files that Helm fills in with values).
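To give a feel for the first of these files, here is a sketch of a minimal Chart.yaml; the name, description, and version values below are placeholders rather than the actual product chart:

apiVersion: v2          # chart format used by Helm 3
name: speech-chart      # hypothetical chart name
description: An example chart for a speech-recognition deployment
version: 0.1.0          # version of the chart itself
appVersion: "3.0.0"     # version of the application the chart deploys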
You’ll note that YAML files are a key part of this file structure. But what are YAML files?
“YAML” stands for “YAML Ain’t Markup Language” (originally “Yet Another Markup Language”). Admittedly, the name is not very informative, but YAML is just a way of specifying objects hierarchically. It is similar in nature to JSON (JavaScript Object Notation). In the case of Helm, such “objects” describe all the parameters Helm needs in order to know what to configure and build.
Within the file structure, values.yaml is the file most often tinkered with to fine-tune or modify a deployment. The parameter values listed in this YAML file feed into the chart’s templates, adjusting settings as needed. This enables operators to make any tweaks necessary.
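As a sketch (the keys shown are hypothetical, chosen only to illustrate the idea), a values.yaml file might contain a small hierarchy like this:

replicaCount: 2               # how many copies of the service to run
image:
  repository: lumenvox/asr    # hypothetical container image name
  tag: "3.0"                  # which image version to pull
resources:
  memory: 2Gi                 # memory to reserve for each container

Templates in the chart then refer to these settings (for example, as .Values.image.tag), so operators can override any of them without editing the templates themselves.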
Helm charts can even incorporate placeholders in these files. A placeholder for a value named version_num looks like this:
{{ .Values.version_num }}
Values for placeholders, in turn, can be filled in on the Helm command line, so that every time you run Helm to perform a deployment (or to unwind an existing one), values can be specified, customizing the action as appropriate.
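Continuing the hypothetical version_num example, a command-line override might look like this:

helm install speech-stack ./speech-chart --set version_num=3.2.1    # fills in {{ .Values.version_num }}
helm upgrade speech-stack ./speech-chart --set version_num=3.2.2    # redeploy with a new value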
Within a template file, you can also use placeholders to refer to other information bundled with the chart. For example:
{{ .Files.Get "FILENAME" }}
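In practice, such a reference typically pulls a file packaged with the chart into a deployed resource. A minimal sketch, assuming a bundled file named config.ini (a hypothetical name):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  config.ini: |-
{{ .Files.Get "config.ini" | indent 4 }}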
The following figure summarizes how the parts of the system fit together and sit on top of the infrastructure. The same deployment can run on any supported operating system, provided the rest of the infrastructure shown is also in place.
Highlights of the new engine
Here are the highlights of the new LumenVox ASR engine. Subsequent sections drill down into these features, giving more details:
End-to-end acoustical modeling with Convolutional Neural Networks (CNN). This extends the benefits of neural networks from the sound-recognition stage all the way to the production of text, removing the need for separate speech engines to accommodate dialects and accents.
Transfer-learning techniques to improve learning efficiency.
A streaming model that operates in an online manner, thereby boosting responsiveness and performance.
Performance and accuracy aided by statistical n-gram models.
Performance and accuracy aided by traditional SRGS grammars.
Use of quantized sorted-tree n-gram Statistical Language Models (SLMs), efficiently compressed in storage, enabling higher performance in less memory.
End-to-End Acoustical Modeling with Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN) are a more advanced version of the neural-net and DNN technologies. A CNN approach develops patterns in successive layers, each building upon the layer before it.
In the visual-pattern recognition world, CNN starts by recognizing dots and small areas on the screen. Then it builds on these items to “see” lines and curves. Finally, it builds on lines and curves to see shapes.
In the speech recognition world, CNN does something similar: it hears and recognizes small bits of sound; then it builds on sounds to recognize phonemes. Finally, it builds on phonemes to recognize words and phrases.
In contrast, the legacy model does the following:
The ASR produces a series of phonemes (specific pronunciations).
A language model is then used to make sense of these phonemes and build them into words.
The legacy approach may work until the engine encounters a new dialect, a new accent, or sometimes just idiosyncrasies in the way a person talks. At that point, it fails. Under the old model, every time the dialect or the accent changes, a new speech model is required. That can stop customer service in its tracks.
With the new speech engine, the neural-net technology is end-to-end because it uses the neural net to fold in the problem of dealing with different pronunciations. In other words, we let the most expert system—the neural network itself—handle the problem. In doing so, the engine takes advantage of machine-learning technology to solve the problem of variations in speech.
Transfer Learning Techniques
Transfer learning is a machine-learning technique in which a model trained for one task is reused to initialize a model for a different task. This is a very common approach in deep learning, where pre-trained models are used as the starting point in many fields.
In our case this means that we can reuse models trained on thousands of hours of data, an amount only available for a small fraction of languages (e.g., English), to initialize the training of models for languages where only a reduced amount of data is available (e.g., Italian).
This results in much better accuracy and performance than when training such a model from scratch using only target language data. Furthermore, transfer learning speeds up the training process considerably (days versus months). This lowers the cost of training models for new languages.
Statistical n-gram Language Models
An “n-gram” is a sequence of n words that is statistically likely. For example, “to beak” is an unlikely bi-gram (a group of two words), but “to be” is a likely one; we should expect the words “to be” to appear together in that order, but not “to beak.”
The advantage of this technology is that it aids in determining what words were probably said, thereby making the engine less likely to make a mistake. For example, let’s say the speaker said something like this:
I’d like to book a fly please.
Using n-gram analysis, the software can determine that it is much more likely the speaker meant the following sentence and that the words were simply not heard perfectly (“book a flight” is a common tri-gram):
I’d like to book a flight please.
By understanding what combinations of words are more likely to appear together, the speech engine can more accurately determine what the speaker meant to say.
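In rough terms, these likelihoods come from counting word sequences in large amounts of training text. For the example above, the engine is essentially comparing:

P(flight | "book a") = count("book a flight") / count("book a")
P(fly | "book a")    = count("book a fly") / count("book a")

Because “book a flight” occurs far more often in real text than “book a fly,” the first reading wins.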
Quantization of Sorted-Tree n-gram Statistical Language Models (SLM)
With the technology used by the new engine, language-model n-grams (common sequences of words) are stored in sorted trees, a data structure that can be searched with great efficiency. The new engine also uses a compression scheme that allows larger and deeper language models (that is, more of these n-grams) to be stored in a smaller space. More detailed language models are therefore represented more compactly, doing more with less memory.
Use of Traditional SRGS Grammars
SRGS grammars use BNF or XML syntax to model which words are expected to appear together and in what sequence. The new engine uses these grammars as another way to predict which patterns of words are more likely. This enables more accurate processing of sounds into text.
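As a small illustration (not an actual LumenVox grammar), an SRGS grammar in XML form that expects a destination city after “book a flight to” might look like this:

<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="booking">
  <rule id="booking" scope="public">
    <item>book a flight to</item>
    <one-of>
      <item>Boston</item>
      <item>Denver</item>
      <item>Seattle</item>
    </one-of>
  </rule>
</grammar>

A grammar like this tells the engine exactly which word sequences to expect, which is why recognition within the grammar can be so accurate.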
Streaming Models
The new engine makes direct use of speech input in real time. Incoming sounds are processed as they are received, and knowledge picked up during one pass is used right away. As a result, there are no multiple passes or extra cycles of processing, and no perceptible delay between the caller speaking and the system responding. The result is a system that is faster and more responsive, which makes for happier callers.