Structure, Language and Art

In a recent post tylera5 commented that the last time he wrote poetry was in high school, and that he wasn’t expecting to have to write a poem for a programming course. I got the idea for a poetry assignment from a friend of mine who teaches a biological science course. She found that the challenge of condensing a technical topic into a 17-syllable Haiku really forces one to think critically about the subject and filter through all the information to shake out the key concept. And poems about tech topics are just fun to read!

I think the benefit is even greater for a programming course. As tylera5 mentioned, both poems had a structure, and he had to think a bit about how to put his thoughts into the structure dictated by the poetry form, whether it be the 5/7/5 syllable structure of a Haiku, or the AABBA rhyming scheme of a limerick.

Poetry is the expression of ideas and thoughts through structured language (and the structure can play a larger or lesser role depending on the poet, and type of poetry). Programming also is the expression of ideas and thoughts through structured language. The domain of ideas is often more restricted (though not necessarily, this article and book could be the subject of a full post in its own right) and adherence to structure is more strict, but there is an art to both forms of expression.

Are there artistic and expressive tools in other STEM topics as well?

Response to “Class Material – reposted”

In his post titled Class Material – reposted, zickbe asked a very good question about the content of ECE2524.  It is a question that has come up at least once every semester.  To paraphrase: “Since there are modern GUI tools for Linux now, why are we learning all these old command line tools?”  The example given in the post was a simple task of replacing all periods ‘.’ with commas ‘,’ in some text input.  Indeed, many graphical editors do have search and replace functionality that makes this particular task quite easy.  So what’s the point of learning to do it from the command line?

There are two answers to this question, each from a different perspective.

You as the User

The first is probably the perspective you are all thinking about right now: you as a user of a general-purpose operating system, editing files, writing code, surfing the web, etc.  As we have seen already, Unix has a strong tradition as a platform for text manipulation (remember, its first use was as an OS to run a word processing system for the AT&T Bell Labs patent department).  When we store our data in plain text we have a large collection of powerful tools to manipulate and process that data.

Of course, when learning new concepts we start with simple examples.  One of the simplest ways we can manipulate text is with a literal substitution, for example “replace all occurrences of the word ‘cat’ with ‘dog’ “, or “replace all occurrences of ‘.’ with ‘,’ “.  Literal substitutions are used often enough that many graphical tools have built the feature into the interface.  Let’s say we have a file myfile.txt and we want to change all occurrences of ‘dog’ to ‘cat’.  We could either use the terminal:

sed -i 's/dog/cat/g' myfile.txt

Or we could open myfile.txt in our favorite text editor, choose the menu option for “search and replace”, enter “dog” and “cat” in the appropriate fields, click “ok” and we’re done.  For this simple case it seems like it’s hardly worth the brain-space to remember how to use sed.  Let’s kick it up a notch though.  When writing software applications we often have many files associated with one project.  What if we wanted to replace ‘dog’ with ‘cat’ across several files?  Using the GUI we would open each file in succession, click the menu that contains “search and replace”, fill in our search and replace words, hit ‘ok’ and then repeat for the remaining files.  This is probably doable for a few files.  What about 100?  1000?

find project/ -name '*.txt' -exec sed -i 's/dog/cat/g' '{}' \+

or

find project/ -name '*.txt' | xargs sed -i 's/dog/cat/g'

The nice thing about this is that the amount of effort we put in is the same no matter how many files we want to process, whether it be 3, 100, 1000 or more.  Try doing 100 text substitutions in a GUI and you’re asking for a repetitive stress injury!
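Here is a minimal sketch of that idiom that can be tried safely in a scratch directory. The helper name `replace_in_tree` and the sample files are made up for illustration. It assumes GNU sed (for `-i` with no backup suffix) and a find and xargs that support `-print0` and `-0`, which keep file names containing spaces intact.

```shell
#!/bin/sh
# replace_in_tree DIR OLD NEW   (hypothetical helper name)
# Replace OLD with NEW in every *.txt file under DIR.
# Assumes GNU sed (-i without a backup suffix) and that OLD and NEW
# contain no '/' or regular-expression metacharacters.
replace_in_tree() {
    dir=$1 old=$2 new=$3
    # The pattern is quoted so the shell does not expand it before find sees it.
    # -print0 and -0 separate file names with NUL bytes instead of whitespace.
    find "$dir" -type f -name '*.txt' -print0 |
        xargs -0 sed -i "s/$old/$new/g"
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'the dog barks\n' > "$demo/a.txt"
printf 'dog eat dog\n'   > "$demo/sub dir/b.txt"
printf 'dog\n'           > "$demo/notes.md"    # not *.txt, so left untouched

replace_in_tree "$demo" dog cat
cat "$demo/a.txt" "$demo/sub dir/b.txt" "$demo/notes.md"
rm -rf "$demo"
```

The command inside the function does not change whether the directory holds three files or three thousand.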

“Ok”, you’re saying to yourself, “but how often am I working with hundreds of files at once? I usually just have one or two files I want to modify, it’s not too bad to navigate the GUI menu a few times to do text substitution.”  Let’s think of some more examples of text manipulation you might want to do. In my previous post I described the process I went through to compile a list of links to last semester’s projects. At one point I wanted to prepend each line with a ‘-’ character to generate a list in Markdown syntax.  I could have just manually added the character to each line (there were only 19, after all), but instead I used a sed command:

sed -r 's/^(.*)$/- \1/g'

It didn’t really save me many keystrokes in this case, but it easily fit into the automated workflow I had set up to convert the list of urls to a nice HTML format suitable for posting on the blog. It’s also a task that would have become quite tedious to do by hand if there were more than the 20 or so items that I had. And what if I wanted to do something a bit more complex, like “prepend only the lines containing a url with ‘-’ but leave all others unchanged”?

sed -r 's|^(.*https?://.*)|- \1|g'

Now I can selectively convert lines to a Markdown style list. This is much quicker for even medium sized files than scanning each line by eye to find urls, and then adding a ‘-’. Can your GUI do that? And of course, if I had a few, or a few hundred files that I wanted to process like this, I could use the same `find … -exec` or `find … | xargs … ` idiom I used above.
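To see both list conversions on sample input, here is a small sketch. The function names and the example.com URLs are placeholders, not part of the original workflow. The second function uses a sed address (`/pattern/`) to pick out the lines to edit, which is one alternative way of writing the same selection as the capture-group version above.

```shell
#!/bin/sh
# Prefix every line with "- ". The pattern '^' matches the empty
# position at the start of each line.
bullet_all() {
    sed 's/^/- /'
}

# Prefix only the lines that contain an http:// or https:// URL.
# The /.../ address selects the lines; the s command then edits them.
# -E enables extended regular expressions (GNU sed also accepts -r).
bullet_urls() {
    sed -E '/https?:\/\//s/^/- /'
}

printf '%s\n' 'Projects:' 'http://example.com/one' 'see https://example.com/two' |
    bullet_urls
```

With this input the heading line passes through unchanged and the two lines containing URLs gain a leading "- ".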

Another quick example: You are probably familiar with the two main styles of naming functions with multiple words: CamelCase and underscore_case

def myHelloFunction:

def my_hello_function:

Which style you use is largely a matter of preference, although sometimes when working on collaborative projects the project will define a particular style that you must adhere to. Let’s say you’ve been using one style for a few projects and then decide you want to switch (or you get a bunch of code from a friend who was using a different style, or… )

sed -r 's/([A-Z])/_\l\1/g'

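will convert a camelCase name such as myHelloFunction to my_hello_function.  (The \l escape is a GNU sed extension, and a name that starts with a capital letter, such as CamelCase, comes out as _camel_case.)  Doing the same automatic formatting in a GUI of your choice is left as an exercise for the reader.  A quick Google search will turn up a sed command to do the reverse transformation.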
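For reference, here is one possible sketch of both directions. The function names are hypothetical, and both rely on GNU sed's `\l` and `\u` case-conversion escapes, so they may not work with other sed implementations.

```shell
#!/bin/sh
# Both helpers rely on GNU sed's \l and \u escapes, which are extensions.

# lowerCamelCase -> snake_case: put '_' before each capital and lowercase it.
camel_to_snake() {
    sed -E 's/([A-Z])/_\l\1/g'
}

# snake_case -> lowerCamelCase: drop each '_' and uppercase the letter after it.
snake_to_camel() {
    sed -E 's/_([a-z])/\u\1/g'
}

echo 'def myHelloFunction():'   | camel_to_snake
echo 'def my_hello_function():' | snake_to_camel
```

Note that these substitutions touch every matching character on every line, including those inside strings and comments, so a real conversion would probably need a more selective pattern.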

The take-away from all of this is that while the examples we use in class may be simple enough that a GUI editor happens to have implemented similar functionality, the tools themselves are much more powerful. GUIs are great in that they make it really easy to do the things that the GUI designers planned for. However, they make it difficult or impossible to do things that the designers didn’t plan for.  In the case where you want to perform a text manipulation on a large number of files, or a complex manipulation on one or more files, the command line tools provide a solution where the graphical tools do not.
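Coming back to the example that started this discussion, replacing periods with commas, here is a sketch of the simple case next to a more selective variant. The function names and sample sentence are made up for illustration.

```shell
#!/bin/sh
# Replace every period with a comma. In a regular expression an unescaped
# '.' matches any character, so the literal period is written as '\.'.
periods_to_commas() {
    sed 's/\./,/g'
}

# Replace a period only when it sits between two digits, for example to
# turn "3.14" into "3,14" while leaving sentence-ending periods alone.
decimal_points_to_commas() {
    sed -E 's/([0-9])\.([0-9])/\1,\2/g'
}

echo 'Pi is about 3.14. Really.' | periods_to_commas
echo 'Pi is about 3.14. Really.' | decimal_points_to_commas
```

The first function is what a typical search and replace dialog does. The second depends on the characters around the period, which a plain literal search and replace cannot express.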

You as the Developer

But you’re not just any user, are you?  You are getting a degree in Computer Systems Engineering, and even if you plan to focus on hardware it is a guarantee that you will be writing software at some point (probably many points).  You may even write some software that needs to do text manipulation.  Perhaps a preprocessor for a compiler, or even your own text editor.  What if you want to build in some functionality to allow the end-user to do some text manipulation?  Maybe a simple text substitution, or perhaps you’re writing an IDE and want to provide a menu option to automatically convert CamelCase to camel_case across a set of project files.  How would you implement this?  For these examples it probably makes sense to use the regular expression library of whichever language you are programming in, but even in that case, the expressions themselves will be the same as in the sed example.  In some cases you may actually want to spawn a child process running one of the sed commands from above directly (maybe you want to run a complex text manipulation on a large number of files that a user selects with a GUI and let the manipulation run in the background while the GUI is free to take additional requests from the user).
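The child-process idea can be sketched in the shell itself. A real application would use its own language's process API, but the shape is the same: start the work, carry on, and collect the result later. The name `convert_files` and the sample files are hypothetical, and the sketch assumes GNU sed.

```shell
#!/bin/sh
# convert_files FILE...   (hypothetical helper name)
# Convert camelCase to snake_case in place in the given files.
# Assumes GNU sed for -i without a backup suffix and for the \l escape.
convert_files() {
    sed -i -E 's/([A-Z])/_\l\1/g' "$@"
}

demo=$(mktemp -d)
printf 'myHelloFunction\n' > "$demo/a.py"
printf 'anotherName\n'     > "$demo/b.py"

convert_files "$demo/a.py" "$demo/b.py" &   # '&' runs the work in a child process
job=$!                                      # remember the child's process ID

echo 'still responsive while the conversion runs'

wait "$job"                                 # block until the child exits
echo "conversion finished with status $?"
cat "$demo/a.py" "$demo/b.py"
rm -rf "$demo"
```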


As you are working with the command line and working through the examples for this class remember to keep in mind the flexibility of the commands you are learning.  In many cases the examples will be so simple that the same functionality has been implemented in any of the popular graphical tools, but the command line version provides much more control and flexibility, as I hope these few examples have demonstrated.  Can you think of any other examples that could be done using command-line text manipulation tools but would be impossible in a general purpose graphical environment?

As I mentioned before, this question comes up every semester.  How could the material in the class be modified to make the power of the tools we learn more apparent?  Should more complex examples be included at the possible expense of clarity? More examples?  Was the explanation I gave here convincing?  If not, please explain why in the comments and I’ll do my best to revise!

What Makes Good Software Good?

On the first day of class (ECE2524: Introduction to Unix for Engineers) I asked participants the open-ended question “What makes good software good?” and asked them to answer both “for the developer” and “for the consumer”.

I generated a list of words and phrases for each sub-response and then normalized it based on my own intuition (e.g. I changed “simplicity” to “simple”, “easy to use” to “intuitive”, etc.). I then dumped the list into Wordle to generate these images:

Good Software for the Consumer

Good Software for the Developer

For a future in-class exercise I plan to ask participants to link the common themes that appear in these word clouds back to specific rules mentioned in the reading.
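As an aside, part of the normalization step described above could be scripted with the same tools we use in class. The following is only a sketch: the function name is made up, the two synonym rules are the examples mentioned in this post, and the rest of the normalization was done by hand and is not captured here.

```shell
#!/bin/sh
# normalize_terms   (hypothetical helper name)
# Read one answer per line, normalize it, and count how often each occurs.
# Only the two synonym rules mentioned in the post are included; a real
# list would need many more, chosen by hand.
normalize_terms() {
    tr '[:upper:]' '[:lower:]' |
        sed -E -e 's/^[[:space:]]+//; s/[[:space:]]+$//' \
               -e 's/^simplicity$/simple/' \
               -e 's/^easy to use$/intuitive/' |
        sort | uniq -c | sort -rn
}

printf '%s\n' 'Simple' 'easy to use' 'simplicity' 'Intuitive' 'fast' 'simple ' |
    normalize_terms
```

The output is a frequency table with the most common term first, which is the kind of weighting a word cloud generator works from.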