Software – Intro to New Media

What to watch for

After completing this lesson, you’ll be able to:

Define (at a high level) what computer software is and how it works
Explain what a programming language is and why multiple ones exist
Describe the history and culture of software development
Recognize prominent software-related terms and concepts

Required Reading:
“What is Code?” by Paul Ford

(31,558 words / 150-180 minutes)

Okay y’all, I need you to brace yourselves. This required reading is nearly 32,000 words long. That’s like a 100-page book, more or less.

But it’s also freaking awesome.

In fact, it’s so awesome, that it’s all we’re going to read about software. There’s lots of other stuff you could (and should, if you’re interested in the topic) read, but this—this what? Novella?—really gets us just about everything I could ask for and more.

A few thoughts before you dive in:

If at all possible, try to read this in a desktop web browser. The mobile version is good, but the desktop version has so many fun goodies built in, it’ll make your head swim. (I honestly don’t know how Mr. Ford convinced Bloomberg Businessweek to make this happen.)
It’s possible to read all this in a single sitting, but it’s really not advisable. I think two to three solid sessions will be just right for most people.
Don’t just read to get through this; read to understand.[1]
But also, don’t feel the need to look up every single thing here (unless you really want to, in which case, great!). While I’m normally an enormous fan of looking up everything you don’t know when reading, if you try to do this here, you won’t emerge from the internet for several weeks—possibly longer.
With this reading, I’m mostly just going to pull quotes from it and share my thoughts on them—there’s just so much good stuff here.

Okay, go knock yourselves out! I’ll be here, waiting.[2]

We are here because the editor of this magazine asked me, “Can you tell me what code is?”

“No,” I said. “First of all, I’m not good at the math. I’m a programmer, yes, but I’m an East Coast programmer, not one of these serious platform people from the Bay Area.” …I’m not a natural. I love computers, but they never made any sense to me. And yet, after two decades of jamming information into my code-resistant brain, I’ve amassed enough knowledge that the computer has revealed itself. Its magic has been stripped away. I can talk to someone who used to work at Amazon.com or Microsoft about his or her work without feeling a burning shame.

One of the reasons I love this article is the author’s perspective. He knows plenty, but he empathizes with folks who find computers and code a bit mystifying.

There are 11 million professional software developers on earth, according to the research firm IDC. (An additional 7 million are hobbyists.) That’s roughly the population of the greater Los Angeles metro area.

That’s a lot of people, no?

There have been countless attempts to make software easier to write, promising that you could code in plain English, or manipulate a set of icons, or make a list of rules—software development so simple that a bright senior executive or an average child could do it. Decades of efforts have gone into helping civilians write code as they might use a calculator or write an e-mail. Nothing yet has done away with developers, developers, developers, developers.

Thus a craft, and a professional class that lives that craft, emerged. Beginning in the 1950s, but catching fire in the 1980s, a proportionally small number of people became adept at inventing ways to satisfy basic human desires (know the time, schedule a flight, send a letter, kill a zombie) by controlling the machine. Coders, starting with concepts such as “signals from a keyboard” and “numbers in memory,” created infinitely reproducible units of digital execution that we call software, hoping to meet the needs of the marketplace.

About as clear a description of software development as you could ask for.

You should probably read Section 2.3, “How Does Code Become Software?”, at least twice. It’s the foundation on which your understanding of the rest of the article stands.

Computers usually “understand” things by going character by character, bit by bit, transforming the code into other kinds of code as they go. The Bolus compiler now organizes the tokens into a little tree. Kind of like a sentence diagram. Except instead of nouns, verbs, and adjectives, the computer is looking for functions and arguments. Our program above, inside the computer, becomes this:

Trees!
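If you’d like to see one of these trees for yourself, here’s a minimal sketch in Python rather than the article’s Bolus. It leans on Python’s built-in `ast` module to do the actual parsing. The helper names `to_tree` and `parse` are mine, made up for illustration, and they only handle function calls whose arguments are plain values or other function calls.

```python
import ast


def to_tree(node):
    """Turn a parsed expression into nested (function, arguments) tuples."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        # A function call: keep its name and convert each argument in turn.
        return (node.func.id, [to_tree(arg) for arg in node.args])
    if isinstance(node, ast.Constant):
        # A plain value such as 1 or "hello".
        return node.value
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def parse(source):
    """Parse one expression and return its simplified tree."""
    return to_tree(ast.parse(source, mode="eval").body)


tree = parse("print(add(1, 2))")
print(tree)  # ('print', [('add', [1, 2])])
```

Notice that nothing in the tree is a noun or a verb: it’s functions (`print`, `add`) and their arguments (`1`, `2`), nested inside one another.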

Every character truly, truly matters. Every single stupid misplaced semicolon, space where you meant tab, bracket instead of a parenthesis—mistakes can leave the computer in a state of panic.

Just remember this when you get to NMIX 4010; you’ll find this out first-hand, repeatedly.
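Here’s a small taste of that, sketched in Python. The `compiles` helper is a hypothetical name of mine; it simply asks Python whether it can make sense of a line of code. The two lines being checked differ by exactly one character.

```python
def compiles(source):
    """Return True if Python accepts the source text, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


good = "total = (1 + 2) * 3"
bad = "total = (1 + 2] * 3"  # one bracket where a parenthesis belongs

print(compiles(good))  # True
print(compiles(bad))   # False
```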

It’s a good and healthy exercise to ponder what your computer is doing right now.

Yes it is!

Algorithms don’t require computers any more than geometry does. An algorithm solves a problem [and a] programming language is a system for encoding, naming, and organizing algorithms for reuse and application. It’s an algorithm management system.

At this point in the article, we’ve gone through a brief refresher on hardware (including data) and developed a basic understanding of how computers process data. Now, we’ve arrived at programming languages, which is the next step up the ladder (Hardware -> Data -> Processing -> Algorithms -> Programming languages).
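To make “an algorithm is not the same thing as a computer” a bit more concrete, here’s a sketch of Euclid’s algorithm for finding the greatest common divisor of two numbers. People worked it by hand for roughly two thousand years before anyone had a computer. I’ve written it in Python only because that’s convenient; the function name `gcd` is my own choice.

```python
def gcd(a, b):
    """Euclid's algorithm: keep replacing the pair (a, b) with (b, a mod b)."""
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(48, 18))  # 6
```

You could follow the same steps with a pencil: (48, 18) becomes (18, 12), then (12, 6), then (6, 0), and the answer is 6. The programming language just gives the recipe a name so it can be reused.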

A bit more about algorithms and computer science:

One thing that took me forever to understand is that computers aren’t actually “good at math.” They can be programmed to execute certain operations to certain degrees of precision, so much so that it looks like “doing math” to humans. Dijkstra said: “Computer science is no more about computers than astronomy is about telescopes.” A huge part of computer science is about understanding the efficiency of algorithms—how long they will take to run. Computers are fast, but they can get bogged down—for example, when trying to find the shortest path between two points on a large map. Companies such as Google, Facebook, and Twitter are built on top of fundamental computer science and pay great attention to efficiency, because their users do things (searches, status updates, tweets) an extraordinary number of times.
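A rough way to get a feel for “efficiency” is to count steps. The sketch below (in Python, with function names I made up) looks for one number in a sorted list of a million numbers in two different ways. The first checks every item in order. The second, called binary search, repeatedly cuts the remaining list in half. It’s an illustration under simple assumptions, not a benchmark.

```python
def linear_search_steps(items, target):
    """Count how many items we look at when checking them one by one."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count how many items we look at when halving a sorted list each time."""
    steps = 0
    low, high = 0, len(items) - 1
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if items[middle] == target:
            break
        if items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps


numbers = list(range(1_000_000))  # already sorted: 0, 1, 2, ...
target = 999_999                  # the very last item

print(linear_search_steps(numbers, target))  # 1000000
print(binary_search_steps(numbers, target))  # no more than 20
```

Same question, same answer, wildly different amounts of work. Multiply that gap by every search and every tweet, and you can see why those companies care.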

An excellent summary of what programming is and what a programming language tries to do:

The hardest work in programming is getting around things that aren’t computable, in finding ways to break impossible tasks into small, possible components, and then creating the impression that the computer is doing something it actually isn’t, like having a human conversation. This used to be known as “artificial intelligence research,” but now it’s more likely to go under the name “machine learning” or “data mining.” When you speak to Siri or Cortana and they respond, it’s not because these services understand you; they convert your words into text, break that text into symbols, then match those symbols against the symbols in their database of terms, and produce an answer. Tons of algorithms, bundled up and applied, mean that computers can fake listening.

A programming language has at least two jobs, then. It needs to wrap up lots of algorithms so they can be reused. Then you don’t need to go looking for a square-root algorithm (or a genius programmer) every time you need a square root. And it has to make it easy for programmers to wrap up new algorithms and routines into functions for reuse.
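The square-root example is easy to try out. In this Python sketch, `newton_sqrt` is a hand-rolled version I wrote for illustration (it uses Newton’s method, one of several ways to compute a square root), while `math.sqrt` is the one that ships with the language. The point is only that the second one is already wrapped up and waiting for you.

```python
import math


def newton_sqrt(x, tolerance=1e-12):
    """Approximate the square root of a non-negative number with Newton's method."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0  # start at or above the true square root
    while abs(guess * guess - x) > tolerance * x:
        guess = (guess + x / guess) / 2  # average the guess with x / guess
    return guess


# Job one: reuse an algorithm somebody else already wrapped up.
print(math.sqrt(2))

# Job two: wrap up your own routine as a function so it can be reused too.
print(newton_sqrt(2))
```

Both lines should print a value very close to 1.41421356.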

The next few bits of the article go into the culture of programming. This isn’t fluff—how this stuff gets made influences what is made.

The bits about why there are a bunch of different programming languages are important, too. The short answer is that different languages optimize for different things to attempt to solve different problems.

Then, the article digs back into some core programming concepts. Here’s a bit about standard libraries:

The true measure of a language isn’t how it uses semicolons; it’s the standard library of each language. A language is software for making software. The standard library is a set of premade software that you can reuse and reapply.

This is hugely important. As you’ve hopefully grasped by now, there are layers and layers and layers of stuff involved in making a computer work. If you had to do all of it all over again every time you wanted to do something with a computer, it’d be akin to having to reformulate the chemistry for concrete, redesign nails, plant a new forest and build a new sawmill (and lumber delivery service), etc. every time you wanted to build a new house. The more of these problems a programming language or software development platform takes care of for you, the higher the level of abstraction it’s said to have.
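For a tiny example of “premade software,” here’s a Python sketch that counts words using `Counter`, a tool that comes in Python’s standard library. The sentence being counted is just something I made up.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog and the cat"

# Counter does the tallying for us; no need to write the counting loop by hand.
counts = Counter(text.split())

print(counts.most_common(1))  # [('the', 3)]
```

Somebody else already poured that particular slab of concrete, so you get to start building the house.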

Building on this idea (and only tangentially related to the DRY, or Don’t Repeat Yourself, principle already mentioned in the article) is the crucial concept of object-oriented programming:

Object-oriented programming is, at its essence, a filing system for code. As anyone who’s ever shared a networked folder—or organized a physical filing cabinet—knows, without a good shared filing system your office will implode. C, people said in the 1980s and ’90s, is a great language! An excellent language! But it doesn’t really let you organize things. You end up with all these functions. It’s a mess. I mean, we have this data structure for our customers (name, address, and so forth), and we have all these functions for manipulating that data (update_address, send_bill, delete_account), but the thing is, those functions aren’t related to the data except by the naming convention. C doesn’t have a consistent way to name things. Which means it’s hard to find them later. Object-oriented programming gave programmers a great way to name things—a means of building up a library. I could call (run) update_address on a picture of a dog or an Internet address. That approach is sloppy and dangerous and leads to bugs (our forebears reasoned, and not without precedent), and it makes it hard to program with big teams and keep track of everything.

So what if, whaaaat if, we made a little box called Customer (call it a “class,” as in the taxonomical sense, like a Customer is a subclass of the species human, which is a subclass of mammal, etc.), and we put the data and methods relating to customers into that box. (And by box, it’s literally just “public class Customer {}” and anything inside the {} relates to that particular class.)

I mean, you wouldn’t even need to look inside the box. You’d just download a large set of classes, all nested inside one another, study the available, public methods and the expected data, and start programming. Hey, you’d say, let’s put some data into our object, take some data out. Every time we have a new customer we make a new instance of our class. Code can be a black box, with tentacles and wires sticking out, and you don’t need to—don’t want to—look inside the box. You can just put a couple of boxes next to each other, touch their tentacles together, and watch their eldritch mating.

This works out very well, in theory.
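Here’s roughly what that Customer box could look like. The article’s snippet is Java-style; this sketch is in Python, and everything inside the class (the fields, what `send_bill` returns, the sample customer) is my own invention for illustration.

```python
class Customer:
    """A 'box' holding one customer's data plus the methods that act on it."""

    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.balance_due = 0.0

    def update_address(self, new_address):
        self.address = new_address

    def send_bill(self, amount):
        """Record a charge and return the bill text (nothing is really sent)."""
        self.balance_due += amount
        return f"Bill for {self.name}: ${self.balance_due:.2f} due"


# Every new customer is a new instance of the class.
ada = Customer("Ada", "12 Analytical Way")
ada.update_address("34 Engine Street")
print(ada.address)          # 34 Engine Street
print(ada.send_bill(19.99)) # Bill for Ada: $19.99 due

# update_address now belongs to customers, so it can't be run on a dog picture.
picture_of_a_dog = "dog.png"  # stand-in for some unrelated piece of data
try:
    picture_of_a_dog.update_address("Doghouse Lane")
except AttributeError:
    print("A picture of a dog has no update_address method")
```

The data and the functions that touch it now live in the same place, which is the whole “filing system” idea.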

Moving back to the people who do software development, note this quote from the section on “10x programmers”:

Programming is a task that rewards intense focus and can be done with a small group or even in isolation.

And shortly following that:

As a class, programmers are easily bored, love novelty, and are obsessed with various forms of productivity enhancement.

The next sections on data are essential, too:

Data comes from everywhere. Sometimes it comes from third parties—Spotify imports big piles of music files from record labels. Sometimes data is user-created, like e-mails and tweets and Facebook posts and Word documents. Sometimes the machines themselves create data, as with a Fitbit exercise tracker or a Nest thermostat. When you work as a coder, you talk about data all the time. When you create websites, you need to get data out of a database and put them into a Web page. If you’re Twitter, tweets are data. If you’re the IRS, tax returns are data, broken into fields.

Data management is the problem that programming is supposed to solve. But of course now that we have computers everywhere, we keep generating more data, which requires more programming, and so forth. It’s a hell of a problem with no end in sight. This is why people in technology make so much money. Not only do they sell infinitely reproducible nothings, but they sell so many of them that they actually have to come up with new categories of infinitely reproducible nothings just to handle what happened with the last batch. That’s how we ended up with “big data.” I’ve been to big-data conferences and they are packed.

A helpful introduction to databases:

The most prevalent is the relational database, using a language called SQL, for Structured Query Language. Relational databases represent the world using tables, which have rows and columns. SQL looks like this:
SELECT * FROM BOOKS WHERE ID = 294;
Implying that there’s a table called BOOKS and a row in that table, where a book resides with an ID of 294. IDs are important in databases. Imagine a bookstore database. It has a customer table that lists customers. It has a books table that lists books. And it has a clever in-between table of purchases with a row for every time a customer bought a book.

Congratulations! You just built Amazon!
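You can try the bookstore idea yourself with SQLite, a small relational database that ships with Python. In this sketch the table layouts, the customer names, and the book titles are all made up by me. Only the three-table idea (customers, books, and purchases in between) and the ID of 294 come from the quote.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # a throwaway database that lives in memory
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books     (id INTEGER PRIMARY KEY, title TEXT);
    -- The in-between table: one row for every time a customer bought a book.
    CREATE TABLE purchases (customer_id INTEGER REFERENCES customers(id),
                            book_id     INTEGER REFERENCES books(id));

    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO books     VALUES (294, 'A Hypothetical Book'),
                                 (295, 'Another Hypothetical Book');
    INSERT INTO purchases VALUES (1, 294), (2, 294), (2, 295);
""")

# The query from the article: fetch the row for book 294.
book = db.execute("SELECT * FROM books WHERE id = 294").fetchone()
print(book)  # (294, 'A Hypothetical Book')

# Use the in-between table to find out who bought that book.
buyers = db.execute("""
    SELECT customers.name
    FROM purchases
    JOIN customers ON customers.id = purchases.customer_id
    WHERE purchases.book_id = 294
    ORDER BY customers.name
""").fetchall()
print(buyers)  # [('Ada',), ('Grace',)]
```

The IDs are what tie the three tables together, which is why the article says they’re important.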

Be sure to read the brief section (5.4) on Microsoft.

Also, make sure you understand the difference between Java (and the not-unrelated term “enterprise software,” discussed elsewhere in the article):

Java is a programming language that was born at Sun Microsystems (R.I.P.), the product of a team led by a well-regarded programmer named James Gosling. It’s object-oriented, but it also looks a lot like C and C++, so for people who understood those languages, it was fairly easy to pick up. It was conceived in 1991, eventually floating onto the Internet on a massive cloud of marketing in 1995, proclaimed as the answer to every woe that had ever beset programmers. Java ran on every computer! Java would run right inside your Web browser, in “applets” (soon called “crapplets”), and would probably take over the Web in time.

And JavaScript:

Remember Netscape, the first huge commercial Web browser? In 1995, as Java was blooming, Netscape was resolving a problem. It displayed Web pages that were not very lively. You could have a nice cartoon of a monkey on the Web page, but there was no way to make the monkey dance when you moved over it with your mouse. Figuring out how to make that happen was the job of a language developer named Brendan Eich. He sat down and in a few weeks created a language called JavaScript.

JavaScript’s relationship with Java is tenuous; the strongest bond between the languages is the marketing linkage of their names. And the early history of JavaScript was uninspiring. So the monkey could now dance. You could do things to the cursor, make things blink when a mouse touched them.

But as browsers proliferated and the Web grew from a document-delivery platform into a software-delivery platform, JavaScript became, arguably, the most widely deployed language runtime in the world. If you wrote some JavaScript code, you could run it wherever the Web was—everywhere.

And be sure that you at least somewhat[3] understand what Node.js is.

PHP (which stands for “PHP: Hypertext Preprocessor”; it’s a recursive acronym, and programmers love silly jokes like this) is hugely important. Be sure you have at least a fundamental understanding of its place in the coding ecosystem.

Next, be sure that you have a reasonable grasp of what an IDE (Integrated Development Environment) is and does (including the example used in the article, Xcode), as well as a general understanding[4] of what SDKs (Software Development Kits) and frameworks are, too.

I love the bits in this article about debugging and testing.

Version control, as the article notes, is absolute magic. You need to understand it, and what GitHub is.

Okay, we’re almost there, and if I’m honest, you’ve made it through the tough stuff. Don’t skip the bits about software development processes, though. You need to be able to understand everything (including—especially!—the jargon-y words) in the following paragraph:

They will do their standups. And after the standups, they will go off and work in the integrated development environments and write their server-side JavaScript and their client-side JavaScript. Then they will run some tests and check their code into the source code repository, and the continuous integration server will perform tests and checks, and if all goes well, it will deploy the code—perhaps even in August, in some cloud or another. They insist that they’ll do this every day, continuous releases.
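If “run some tests” feels fuzzy, here’s a very small sketch of the idea in Python. Real teams use dedicated testing tools and an actual continuous integration server; this is only a toy. The `slugify` function stands in for some hypothetical bit of website code, and `run_tests` is a made-up stand-in for what the server does: run every test, and only let the code through if none fail.

```python
def slugify(title):
    """Turn a page title into a URL-friendly slug (hypothetical site code)."""
    return "-".join(title.lower().split())


def test_lowercases_and_joins_words():
    assert slugify("What Is Code") == "what-is-code"


def test_collapses_extra_spaces():
    assert slugify("  What   Is Code ") == "what-is-code"


def run_tests(tests):
    """Run each test function and return the names of any that failed."""
    failures = []
    for test in tests:
        try:
            test()
        except AssertionError:
            failures.append(test.__name__)
    return failures


failures = run_tests([test_lowercases_and_joins_words, test_collapses_extra_spaces])
print("deploy" if not failures else f"blocked by: {failures}")  # deploy
```

If someone later breaks `slugify`, a test fails, and the change gets stopped before it ever reaches the cloud.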

And with that, you’re almost done! There’s a nice coda about whether or not you should learn to code, and then finally, you’re presented with this:

So, that was, admittedly, kind of a weird way to learn about software, but I think it’s also probably the best possible thing we could’ve done. You pick up all the vocabulary along the way (and if you didn’t get it all the first time, read back through the piece again. Oh, and you are taking notes on all this, right?), plus you’re exposed to the history and culture of software, too.

You might have questions or want clarification about things in the article. Start by discussing things in your group on Slack, and if y’all are collectively confused, then by all means, let me know, and I’ll hop in and do my best to help.

Discussion questions

Share your honest feelings about “What is code?” Yes, of course it’s long, but what else? (Feel free to share both good and bad!)
What was your favorite fun / interactive thing in “What is code?”
Have you ever done any coding / programming? If so, when and in what language(s)?
Do you know anyone whose job is working with code? Talk to her / him after reading this lesson; ask them a bunch of questions, and share what you thought was interesting from your conversation with the group.
If you want to start learning how to code, you can check out Codecademy (and take NMI courses, of course!). I recommend starting with HTML and CSS and then moving on to JavaScript.
Think (and share) about what your computer is doing right now (in as much detail as possible) while you’re reading this question.
Try to explain what an algorithm is.

Words on / reading time for this page: 2,970 words / 15-20 minutes

Words in / reading time for required readings: 31,558 words / 150-180 minutes

Total words in / reading time for this lesson: 34,528 words / 165-200 minutes

[1] True for the course in general, too, of course!
[2] If you’re the kind of person that notices this sort of thing, there’s a reason that I didn’t include the whole thing in the “Required Reading” container. Why? It’s easier to read; that’s all!
[3] Even a teeny-tiny bit.
[4] It really doesn’t have to be too precise. Just a loose concept of what they are / do is okay.
