It is my hope that what you have learned in this book so far will be of great use to you as you grow technically and pursue a career in software engineering. You’ve learned about expressions and variable assignments, data types, basic program control flow, functions, classes, and several programming paradigms. These concepts are all crucial to the practical application of programming. But what is software engineering, and how is it different from programming?
Software engineering is the practice of rigorously applying quantitative, disciplined principles and standards to the process of programming. This becomes increasingly important as the programs that you write grow in scale, complexity, and impact and as collaboration becomes mandatory. In such scenarios, it is not enough to simply program or “code”; you must engineer. This is an important distinction between a software engineer and a programmer in terms of career title.
Consider, for example, that you have been charged with implementing a system that solves the problem of moving people between floors in a very large building. In this theoretical scenario, let’s pretend that the elevator has not yet been invented, but you decide to take a shot at a similar solution. At first glance, an automated pulley system seems to be the obvious solution. Fulfilling that request seems fairly simple. It requires only a passenger platform, an electric motor, a rope, and a pulley at the top and the bottom of the range of floors that the platform must travel between. So, you create this simple system, run it a few times to make sure it works, and then open it up for others to use. It works beautifully and makes the building far more efficient for its few tenants. However, tenants of other buildings start to hear about these efficiency gains and move into your building hoping to realize the same gains in their day-to-day operations. Suddenly, your elevator is being used by more and more passengers every day. It is also carrying more and more passengers at once as the building fills up with tenants and people begin to get impatient. In hindsight, it is easy to see that the rope of the elevator is eventually going to break as it wears down and too many passengers crowd your platform. If that were to happen, it is likely that many passengers would be injured.
Clearly, projects that fall in the top right quadrant require the strictest engineering practices. Examples of such projects might include the software that powers self-driving cars, the software behind robotic surgical equipment, the coordination software for air traffic control at airports, or the software that executes millions of dollars’ worth of trades on a stock exchange every day. What is less obvious is what the other three quadrants might contain. The importance of a given risk can vary with your perspective. Software that produces optimal driving routes to deliver pizza might have a high likelihood of producing less than optimal routes depending on traffic or weather conditions. However, the impact of these sub-optimal routes is low. No lives are at stake, and only a minimal amount of transaction cost per delivery is on the line. But if you have created a company whose sole business is to sell this routing service to pizza companies, you might not have a business if you continue to produce sub-optimal routes. So, do not take the evaluation of software risks lightly. Consider all possibilities and mitigate any significant risks with the following engineering strategies.
Efficiency and Optimization
The first engineering strategy to minimize risk in your software is to ensure that your programs are optimized for efficient performance. As your programs scale in size, complexity, and usage, how will your program respond? Determining how your program responds to scale is important in understanding its operating cost. Having the ability to measure the performance or the cost of a piece of software is often key in determining whether a business can be profitable or even possible. For example, if your software analyzes customer feedback to determine the sentiment of customer interactions with your product, is it possible to analyze every piece of feedback? How many computers or servers must your software run on in order to analyze each response? If your business grows and you continue to get more responses, will you have to buy more server space to run your software? If you do buy more computers, how much will it cost you each year to pay for the salaries and equipment to maintain these machines?
A procedure that prints each item from a list to the terminal
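A minimal sketch of such a procedure is shown here. The object name `PrintItems`, the helper `printAll`, and the list of fruit names are assumptions chosen for illustration. Because the procedure visits each element exactly once, the amount of work grows in step with the length of the list, which is O(n).

```scala
object PrintItems {
  // Visits every element exactly once, so the work grows
  // linearly with the size of the list: O(n).
  def printAll(items: List[String]): Unit =
    items.foreach(item => println(item))

  def main(args: Array[String]): Unit =
    printAll(List("apples", "bananas", "cherries"))
}
```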
A procedure with exponential growth in algorithmic complexity
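One sketch of work that genuinely doubles with every added element is the enumeration of all subsets of a list. The names `Subsets` and `allSubsets` are hypothetical and serve only to illustrate O(2^n) growth.

```scala
object Subsets {
  // Every element is either left out of a subset or included in it,
  // so each additional element doubles the number of results: O(2^n).
  def allSubsets[A](items: List[A]): List[List[A]] = items match {
    case Nil => List(Nil)
    case head :: tail =>
      val withoutHead = allSubsets(tail)
      withoutHead ++ withoutHead.map(subset => head :: subset)
  }

  def main(args: Array[String]): Unit =
    // Three items yield 2 * 2 * 2 = 8 subsets.
    println(allSubsets(List("apples", "bananas", "cherries")).size)
}
```

Adding a fourth fruit to the list would produce 16 subsets, and a fifth would produce 32, which is why algorithms of this shape become impractical so quickly.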
Understanding how to measure the algorithmic complexity of your programs is the first step in optimizing them. Of the algorithms depicted in this figure, algorithms with O(1) complexity scale the best as their input grows, and O(n!) algorithms scale the worst. In the coming chapters, you are going to see examples of data structures and algorithms that you might end up using in many of your programs. Each will be presented along with its Big O complexity, which you should seek to memorize so that you are best equipped to optimize your code and reduce the risk of creating slow or cost-prohibitive programs.
Exercise 11-1
- 1.
println(items(1))
- 2.
val longFruit = items.filter(fruit => fruit == "bananas")
- 3.
items.map(fruit => fruit.toUpperCase()).find(fruit => fruit.length == 6)
Testing
The next engineering strategy to apply to your projects is meticulous testing through automated test scripts. Automated testing can help you determine if there are bugs in your code by ensuring that you know not only the outcome of the “golden path,” or the expected normal path of your program, but also the outcome of some of the unexpected edge cases. For example, what happens when you don’t provide your program with any data? What happens if you overload it with data? What happens if you give it unrecognizable special characters? In addition to the benefits of this edge case exploration, testing can help you quickly identify whether a change to one part of your code breaks any other part. By maintaining thorough test coverage, you can feel far more confident that your large-scale project will work as expected when you ship it.
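These questions can be turned into executable checks even before a testing library is involved. The sketch below uses Scala’s built-in `assert` and a hypothetical pure function, `wordCount`; the function and its inputs are assumptions made for illustration.

```scala
object EdgeCases {
  // Hypothetical function under test: counts whitespace-separated words.
  def wordCount(text: String): Int =
    text.trim.split("\\s+").count(word => word.nonEmpty)

  def main(args: Array[String]): Unit = {
    assert(wordCount("move the platform") == 3)   // golden path
    assert(wordCount("") == 0)                     // no data at all
    assert(wordCount("word " * 100000) == 100000)  // a large input
    assert(wordCount("¿qué? ☃ %$#") == 3)          // special characters
    println("all checks passed")
  }
}
```

If any of the checks were false, `assert` would throw an error and the final message would never be printed.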
There are a couple of common types of tests typically used within the software engineering industry. The first is known as a unit test. A unit test is a small test script that tests the outcome of one function (or one small block of code) within your program. A unit test relies on the code being tested having no external dependencies, like network or database connections or calls to other functions. Unit tests work really well with pure functions, since giving a pure function a certain input should always yield the same output. The other common type of test is known as an integration test. Integration tests typically aim to determine how different parts of your program interact with one another. For example, when you make a call to one function, how does that impact the state of your entire application? Integration tests are typically much more complicated to write and maintain over time, and they are usually much slower to run.
From your terminal, type sbt sbtVersion to verify that you have SBT installed correctly. If it returns a version number, then it has been successfully added to your computer. If you receive an error, go to the SBT web site and follow the installation instructions for your operating system. You may prefer to download SBT outside of the VS Code Extension Marketplace if you are having trouble. If you are certain that you have installed SBT correctly but you are still getting errors from the command line, you may need to add the sbt command to your environment variables (Windows) or PATH (Mac or Linux). Refer to the “Installing Everything You Need” chapter for more instructions on how to accomplish that.
Once you have the SBT plugin installed, open a terminal, make sure that you are in your project directory, and type sbt to start the Simple Build Tool’s interactive shell (which is not unlike our Nebula OS shell). Several built-in SBT commands are available from this shell, which is also known as a command runner. One of them is compile, which can replace the scalac command while you are inside the SBT shell. The two commands we will use for our unit testing are test and ~testQuick. The test command looks through all of our project files, finds any unit tests we have written, and executes them once, printing the results to the screen. The testQuick command runs only the tests that failed on the previous run, have not yet been run, or depend on code that has changed. Prefixing it with a tilde (~testQuick) puts it in “watch” mode: it runs the tests and then listens for changes to your files. Whenever you save a file, it runs the affected tests again, and it continues in this loop until you tell it to stop.
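For the test command to find ScalaTest, the library has to be declared as a dependency of the project. A minimal build.sbt sketch follows; the project name and the version numbers are assumptions for illustration and should be adjusted to match your own setup.

```scala
// build.sbt (a sketch; name and version numbers are assumptions)
name := "nebula"
scalaVersion := "2.12.10"

// Make ScalaTest available to code under src/test/scala only.
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.8" % Test
```

The `% Test` suffix keeps the testing library out of the compiled program itself, so it is only on the classpath while tests are running.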
Tip
Have you gotten tired of compiling all of your changes after each save of your program’s files? SBT provides a ~compile command that runs in your terminal while you are coding, listens for changes when you save your .scala files, and automatically recompiles any changed files for you. You might also consider adding the Scala (sbt) plugin provided by Lightbend. It is a language server that provides auto-completion and error highlighting while the ~compile command is running.
Initial scaffolding for a test file
A simple example of a unit test
After importing our dependency and declaring our package, we create a class that extends ScalaTest’s FunSpec, which is simply a style of testing that we now have access to within our test file. ScalaTest provides several different styles to choose from, but this style will be very familiar to those who are used to Ruby’s RSpec or JavaScript’s Mocha testing suites. It allows us to write our tests in a flowing, natural-language fashion in which we describe the expected behavior of the program. Each individual unit test is defined by calling the it function. The it calls are wrapped in a describe block, which groups our unit tests under a description that we provide. Within the body of the it function, we define an assertion. In this example, we are asserting that the expression we are testing, 2 + 2, should produce the result we expect, 4.
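The structure described above can be sketched as follows. The package name `nebula` and the class name `ArithmeticSpec` are assumptions, and the sketch targets ScalaTest 3.0.x, where this style lives at `org.scalatest.FunSpec` (from ScalaTest 3.1 onward the equivalent class is `org.scalatest.funspec.AnyFunSpec`).

```scala
package nebula // hypothetical package name

import org.scalatest.FunSpec

class ArithmeticSpec extends FunSpec {
  // describe groups related tests under one heading.
  describe("Addition") {
    // it defines a single unit test with a readable description.
    it("should add 2 and 2 to get 4") {
      assert(2 + 2 == 4)
    }
  }
}
```

SBT looks for test files like this one under the src/test/scala directory of the project.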
Testing the Utilities.addCommand function
An example of the message returned when a test fails
For additional information about writing tests using ScalaTest, including how to create setup and teardown functions that run before and after each test and the different assertions that you can call, visit the official documentation at www.scalatest.org.
Exercise 11-2
- 1.
Ensure that the function is pure and returns a value that can be tested.
- 2.
Refactor the nebula.scala file to print the result of the functions from the command pattern matching expression.
- 3.
Write a test for each function and ensure it passes.
Architecture Planning
The next risk you should seek to mitigate in your software engineering endeavors is the risk of uncoordinated or unorganized code among collaborating engineers due to lack of planning. You want to ensure that no matter how many engineers are working on your project, or who they are, what is written in your project is consistent, readable, and reusable throughout your code base. Oftentimes, this requires either a product manager, a software architect, or both to coordinate with a team to ensure that all code written by the team conforms to a set of quality and consistency guidelines. Architects or product managers also might provide a pre-ordained plan for how to tackle the project in question.
- 1.
Always specify a return statement, no implied returns.
- 2.
Always specify types, no implied types.
- 3.
Always use immutable variables and data structures.
- 4.
Curly braces are to open on the same line as the definition of the scope and close on the same level of indentation as the first line with all lines in between indented at least one level further.
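As an illustration, a function written to satisfy all four of the rules above might look like the following sketch. The names `StyleGuideExample` and `totalLength` are hypothetical.

```scala
object StyleGuideExample {
  // Rule 2: parameter and return types are written out explicitly.
  // Rule 4: the brace opens on the same line as the definition.
  def totalLength(words: List[String]): Int = {
    // Rule 3: an immutable val holding an immutable List.
    val lengths: List[Int] = words.map((word: String) => word.length)
    // Rule 1: an explicit return statement.
    return lengths.sum
  }
}
```

Scala would compile the same function without the type annotations or the return keyword, which is exactly why a team has to agree on such rules rather than rely on the compiler.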
These style guides are obviously based on opinion rather than any hard-and-fast rule, but it is often important for teams to agree upon them ahead of time in order to maintain a consistent code base. Code that does not follow the guidelines will still compile and run, so how might a team enforce such rules? With large, complicated systems, it is all but mandatory to have some form of version control system in place to keep track of changes to the software over time. Examples of such version control systems include Git and SVN (you might have heard of GitHub, which hosts repositories of code in the cloud using the Git version control system). When making changes to code using a version control system, a team can put in place a mandatory peer review before allowing new code to be checked in to an existing project. It is during this peer review process that these style guides can be enforced for consistency. It is also a good opportunity to obtain feedback from peers on the team to ensure that your code is efficient, thus minimizing performance risk.
Software Deployment
The final risk to mitigate is the introduction of bugs into existing production software. For large enterprise software solutions that run on a server farm, mitigating this risk involves using a specific deployment pattern to update production software over time without impacting existing users of that software. This pattern starts with testing code in a local or development environment, continues in a staging or quality assurance environment, and finishes with a push to a production environment.
Local development environments are typically backed by databases with dummy data and partial code and services to minimize how much code a local computer might need to run (relative to the production server which might run a very large amount of code that could not be run completely on a local machine). The software engineer typically makes changes to this development environment, saves his or her changes in the version control system, and then passes the changes off to a peer for review. Once approved, the changes can then be sent to the staging environment for integration testing.
The staging environment is typically a mirror of the production environment. It has a separate database that is kept in sync with the production database to ensure a seamless experience when testing incoming changes from a development environment. The goal of the staging environment is to test the impact of changes from the development environment before they hit production. The staging environment, therefore, acts as a safety gate. If the development changes cause a negative effect on the staging environment, they can be rolled back to the previous version without ever affecting the production environment or the people using the production software. Typically, this step involves a quality assurance (QA) engineer who writes acceptance tests that must be satisfied before the contents of the staging environment can be promoted to production. These tests can be automated using software such as Selenium, but oftentimes they require manual testing as well. Once the QA engineer signs off, the software is deployed to production.
For less critical software, deployment can follow a continuous integration process. Continuous integration (CI) is a practice wherein, once a set of changes is checked in via the version control system, an automated suite of tests kicks off to ensure that the changes will not break any of the existing software. When the pipeline is extended with continuous deployment, passing every test in the CI test suite causes the system to deploy the new software to production automatically. This type of deployment process is typically used for agile software development wherein teams are iterating rapidly on the software and need to deploy to production often.
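The decision at the heart of such an automated pipeline can be sketched in a few lines. The names `DeploymentGate`, `TestResult`, and `shouldDeploy` are hypothetical and do not belong to any particular CI system; the sketch only models the rule that new software is released when every test in the suite has passed.

```scala
object DeploymentGate {
  // Hypothetical record of the outcome of one automated test.
  final case class TestResult(name: String, passed: Boolean)

  // Deploy only if the suite actually ran and every test in it passed.
  def shouldDeploy(results: List[TestResult]): Boolean =
    results.nonEmpty && results.forall(result => result.passed)

  def main(args: Array[String]): Unit = {
    val results = List(
      TestResult("adds two numbers", passed = true),
      TestResult("handles empty input", passed = false)
    )
    println(shouldDeploy(results)) // prints "false"
  }
}
```

Treating an empty list of results as a failure is a design assumption of this sketch: a suite that ran no tests has not demonstrated that the changes are safe.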
Deciding which deployment process to choose typically involves understanding the risks and impacts to your software and business. Whether your software can be easily rolled back or backed up is an important consideration. Another is how much availability or uptime your software is required to have, and whether there would be a significant impact to your business if your software were down for some period of time due to maintenance or repairs during your deployment cycle. Understanding these principles, along with fault tolerance, scalability, and distributed computing, is important in the modern era of software engineering.
Exercise 11-3
- 1.
Look up the formal specification of UML. Refactor the diagram of Weapon classes in the previous chapter using formal UML.
- 2.
Look up different version control systems. Familiarize yourself with their commands and common usages.
- 3.
Look up the various continuous integration systems and try to gain an understanding of how they are implemented. Examples of these systems are Jenkins, CircleCI, and Travis CI.
Summary
In this chapter, you learned that the difference between programming and software engineering is the amount of rigorous process that surrounds engineering in order to ensure resilient, high-quality programs. Those processes included several key concepts. First, you were introduced to Big O notation as a means of measuring complexity. Next, you were given an introduction to unit testing as a strategy for ensuring that edge cases and new code changes do not break your existing code. After that, you were introduced to architecture planning strategies that include UML diagramming, version control, style guides, and peer reviews. Finally, you were given an overview of the software deployment life cycle that is used to maintain and update production programs over time. In the next chapter, we will build upon the engineering skills you learned in this chapter and dive deep into common data structures found in theoretical computer science to help optimize your programs.