I have officially moved my blog away from blogs.lt.vt.edu to my own website: hazyblue.me, which eventually will host not only my blog, but also my teaching philosophy, CV and other professional-related tidbits. I hope that everyone will follow me over to the dark side as I continue to write about education, technology and talking cows. I would especially like to hear feedback on my latest post (inspired by the likes of Janet Murray and Alfred Whitehead), but since I haven’t set up a commenting system yet, please respond via your own blog, a tweet, or an email directly to me!
Author Archives: lilengineerthatcould
And Now You’re an Astronaut: Open Source Treks to The Final Frontier
There have been a couple of blog posts recently referencing the switch NASA made from Windows to Debian 6, a GNU/Linux distribution, as the OS running on the laptops aboard the International Space Station. It’s worth noting that Linux is no stranger to the ISS, as it has been a part of ground control operations since the beginning.
The reasons for the space-side switch are quoted as
…we needed an operating system that was stable and reliable — one that would give us in-house control. So if we needed to patch, adjust, or adapt, we could.
This is satisfying to many Open Source/Linux fans in its own right: a collaborative open source project has once again proved itself more stable and reliable for the (relatively) extraordinary conditions of low Earth orbit than a product produced by a major software giant. Plus one for open source collaboration and peer networks!
But there’s another reason to be excited, and it’s one that (mostly) would not have applied to, say, Apple fanatics had NASA decided to switch to OS X instead of Debian. That reason has to do with the collaborative nature of the open source movement, codified in many of the open source licenses under which the software is released. Linux and the GNU tools, which together make up a fully functional operating system, are released under the GNU General Public License. Unlike many licenses used for commercial software, the GPL ensures that software licensed under its terms remains free for users to use, modify and redistribute. There are certainly some strong criticisms and ongoing debate regarding some key aspects of the GPL, especially version 3; the point of contention mostly lies in what is popularly called the “viral” effect of the license: that modified and derived work must also be released under the same license. The GPL might not be appropriate for every developer and every project, but it codifies the spirit of open source software in a way that is agreeable to many developers and users.
So what does this all mean in terms of NASA’s move? We already know that they chose GNU/Linux for its reliability and stability over alternatives, but that doesn’t mean it’s completely bug free or will always work perfectly with every piece of hardware. No OS will be, which after all is another reason for the switch: at least Debian gives NASA the flexibility to make improvements themselves. And therein lies the reason for excitement. While there is no requirement that NASA redistribute their own modified versions of the software, there is no reason to assume they wouldn’t in most cases, and if they do, it will be redistributed under the same license. It’s certainly realistic to expect they will be directing a lot of attention to making the Linux kernel and the GNU tools packaged with Debian even more stable and more reliable, and those improvements will make their way back into the general distributions that we all use. This means better hardware support for all GNU/Linux users in the future!
And of course it works both ways. Any bug fixes you make and redistribute may make their way back to the ISS, transforming humanity’s thirst for exploring “the final frontier” into a truly collaborative and global endeavor.
Semester in Review
Well, as I’m about 6 hours* into a 14 hour bus+train journey to Massachusetts, I figured this would be a good time to reflect on and respond to the past semester, which seems to have flown by.
The Blogs
I really enjoyed the blog assignment. Even though I wasn’t able to write a reply to every post I felt a lot more in sync with how the class as a whole was progressing. When there was confusion or frustration regarding a particular assignment, or just towards the class in general, I was able to respond quickly (I hope!). I feel I learned much more about how the material in ECE2524 was connected both to other courses and to events that interested you outside of coursework (open source gaming, personal server setups, commentary on Ubuntu as a general purpose OS).
There are a couple things I plan to change with the blog assignment with the end goal of adding a little more structure to the syndicated class blog, and hopefully encouraging more of a discussion.
- enforce “category” and “tag” rules. If you look down the right sidebar of the mother blog you will see a list of all the categories posts have been made under. The current list is too long and not focused enough to be of any use to someone trying to sift through the many posts for a particular topic. Most of the words used for “categories” should have been “tags” instead, so I think spending a little time up front talking about the difference would help the long-term organization of the blog and its usefulness as an archival tool. Some categories I’ve thought of are:
- Introspective: reflect on the course itself, whether it be assignments, discussions or structure.
- Extrospective: explore connections between course material and using *nix systems or applying Unix design philosophy to other courses or events.
- Social Network: comment on and continue the discussion taking place at VTLUUG and VTCSEC meetings.
- Instructional: Discussing personal setups and/or workflows. Posts here will have sort of a “tutorial” or “howto” feel.
There are a couple optional assignments I want to offer that would be linked to blog posts:
- Learn a Language: There are many benefits to learning a new programming language. From The Pragmatic Programmer, Tip #8 “Invest Regularly in Your Knowledge Portfolio”:
Learn at least one new language every year. Different languages solve the same problems in different ways. By learning several different approaches, you can help broaden your thinking and avoid getting stuck in a rut. Additionally, learning many languages is far easier now, thanks to the wealth of freely available software on the Internet
Throughout the semester those opting to do this assignment would document their progress with their language of choice and share any new ways of thinking or problem solving gained by thinking outside their language comfort zone.
- Explore an Environment: An assignment suggested (I need to go through the list to recall who made it) has participants try out an alternative desktop environment and/or window manager. Learners participating in this assignment would make regular blog posts documenting their experience with a particular DE.
- VTLUUG/VTCSEC: There were some issues with the attendance implementation at VTLUUG (in particular) and VTCSEC meetings that frustrated a lot of people and made my life a little more difficult. In addition, an attendance count isn’t really a good metric for the success of this assignment since the purpose isn’t simply to sit in a room for an hour, but to engage with a larger community. Next semester credit will be counted towards the VTLUUG/VTCSEC assignment for blog posts containing targeted discussion and thoughts of the specific topics covered at each meeting.
Assignments
I noticed several people commented that the Inventory Management assignment was about the time when python and the motivation behind assignments started to “click”. I don’t mind that it takes a few assignments in before connections start clicking, but I would like to try and provide more motivation up front about where each assignment is headed, so that earlier along there is at least a notion of “this is going somewhere”. So I’ve been penciling out a clear, focused progression of assignments that goes from basic text parsing up to something like Inventory Management. That project in particular I am also going to make into a group project so that there is some exposure to using git as a collaborative tool before the final project. It also easily breaks up into sub-modules:
- Data Parser
- Command Parser
- Controller
As the name implies the two parsers make use of text parsing concepts, while the controller is more of an exercise in logical program flow. I think with clear enough specs on what the internal data structures should look like, the three parts should be able to be written mostly independently and then combined into one project.
I would also like to start C/C++ development earlier in the semester. I am going to try and restructure exercises and lecture slides so that C/C++ and Python assignments are interwoven throughout the semester. I hope that this will prevent the feeling I got that the semester was split into two distinct phases, the “python” phase and the “C++” phase. That way the content can follow a logical flow and touch on the merits of each language. A brief example of what I’m thinking about:
- simple line parsing (one primitive type, e.g. double/int per line)
  - in python
  - in bash
  - in C++
- processing command line arguments
  - in python
  - in bash
  - in C++
- parsing text lines into an array structure
  - you get the picture
- parsing text lines into a hierarchical structure (e.g. command parser)
  - probably drop bash for this case
- manipulating lists
  - python list comprehension
  - C++ stl algorithms
- Inventory Management (python)
And I am toying with the idea of creating a similar progression (overlapping mostly) that will cover fork/exec and basic IPC with pipe, and lead to a simple shell. As I mentioned in the “Think about it” of the pipeline assignment, all we were missing to create a basic shell program was a string parser that would parse something like “generator | consumer” into an array. Along those lines, I may adjust example code in the “Make a Makefile” assignment to use flex/bison to generate a simple command parser instead of an arithmetic parser.
As those of you familiar with bash are aware, as the complexity of the algorithms and data structures we work with increases, at some point bash will become overly cumbersome. At this point, it will be relegated to the task of writing unit tests of sorts for each assignment (thanks to George for the suggested assignment). This will make bash a more integral part of the course material; there was a notable lack of bash this past semester, which I regret.
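To make the missing piece concrete, here is a minimal Python sketch of that string parser, plus a small runner that wires the stages together. It is only an illustration under some assumptions: the function names `parse_pipeline` and `run_pipeline` are hypothetical, the `|` is assumed to be surrounded by whitespace, and Python's `subprocess` module stands in for the fork/exec/pipe calls the C version of the assignment would use directly.

```python
import shlex
import subprocess


def parse_pipeline(line):
    """Split a line like 'generator | consumer' into one argument list per stage.

    Assumption: each '|' is a separate whitespace-delimited token.
    """
    stages = [[]]
    for token in shlex.split(line):
        if token == "|":
            stages.append([])  # a pipe symbol starts the next stage
        else:
            stages[-1].append(token)
    if not all(stages):
        raise ValueError("empty command in pipeline: %r" % line)
    return stages


def run_pipeline(stages):
    """Start every stage, feeding each stage's stdout to the next one's stdin."""
    procs = []
    prev_stdout = None
    for i, argv in enumerate(stages):
        is_last = i == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_stdout,  # None means "inherit" for the first stage
            stdout=None if is_last else subprocess.PIPE,
        )
        if prev_stdout is not None:
            prev_stdout.close()  # drop the parent's copy of the read end
        prev_stdout = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

With this, `run_pipeline(parse_pipeline("generator | consumer"))` would launch both programs connected by a pipe, provided executables with those names are on the PATH.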
Classroom Time
I’ve been doing a lot of thinking about how to use the classroom time effectively in a way that makes everyone want to come. I think it’s really important that everyone shows up regularly, not just those that feel they need some extra guidance, but also those who have been programming in a *nix environment for 10 years. It’s really important because both the novice and expert can learn a lot from each other if they’re in the same room. It also makes my job easier. There are 60 people enrolled in the class in the Spring: it will be nearly impossible for me to check with everyone individually every time there is a typo in an entered command. Getting a second set of eyes looking at everyone’s commands and code will help people avoid extended debugging sessions and make people more aware of common typos and bugs. To that end I would like to do more collaborative discussions in the classroom, and less of me talking. Regarding assignments, I’d like them due and committed to a network accessible git repo at the beginning of class. Then, in class people will pair up, fork each other’s assignment, review, make edits, and initiate a pull request so that the original author can merge in any bug fixes. The grade for the assignment will be determined by a combination of the functionality of the original commit and the merged changes. This probably won’t take place after every assignment, but at least a few of them.
Depending on how efficient we become at fork/review/merge, I’d like to have more discussions like the one we had about the Process Object assignment. I will try to come up with 3 or 4 “make you think” type questions for each assignment and then in class break up into groups, each discussing one question in depth, then come together as a full class and share the response each group had.
Reflection
I think this post turned into more of a “What I plan to do next semester” than the reflection I had intended. Because it’s probably already too long I’ll try and come to a close. The first semester I taught this course I pretty much followed the supplied lecture slides and exercises that were given to me. The second semester suffered from “all this stuff should be changed but I don’t have any rhyme or reason to it” syndrome (not unlike the second-system syndrome that Raymond talks about with regard to Multics). The next couple semesters, ending on the most recent, I have been tweaking and polishing and streamlining. There were still some bumps this past semester that I would like to eliminate (issues with VTLUUG attendance, problems submitting the midterm, lack of clarity on some of the assignments, much too long a delay on returning some of the graded assignments, to name a few), but I’m optimistic that the next revision will address many of them and hopefully provide a smoother and more enjoyable experience for all. Remind me to write another post about my vision for the class.
*and now I’m 10 hours in… only 4 more to go!
Re: the little things of ubuntu
In a recent post thomaswy mentioned some things he liked about the CLI in Ubuntu (Linux in general, running bash in any distribution should yield an extremely consistent experience) and some things he disliked about the GUI. He’s not alone: just do a quick google search for “what I hate about Ubuntu Unity”. Luckily, there are numerous ways to resolve this. If you read the “Futures” chapter and other bits about the X Window System in The Art of Unix Programming you learned that, to remain consistent with the Unix design philosophy, the designers of X created a clear separation between policy and mechanism. A result of this is several graphical toolkits available to developers who want to create a GUI, and a result of *this* is many different GUI environments. Unity is but one of them, and just because it comes packaged with Ubuntu doesn’t mean that’s all Ubuntu can use. If you aren’t in love with Unity, consider some of the alternatives:
and because it didn’t make it onto the previous list:
And that is but a small sampling of the graphical environments available for Linux. A more complete list quickly becomes overwhelming
21 of the Best Free Linux Window Managers
and that still doesn’t include the one I use, i3.
It’s easy to see why Ubuntu, a distribution aimed at the casual user, would opt not to emphasize the amount of choices you have when it comes to picking a graphical environment!
And then many of the environments are further configured through themes and settings to control the look and feel and behavior for events like “click on a minimized window”. Yes, you can easily spend a day or more finding and configuring the “perfect” desktop. But that’s what makes Linux fun.
Structure, Language and Art
In a recent post tylera5 commented that the last time he wrote poetry was in high school, and wasn’t expecting to have to write a poem for a programming course. I got the idea for a poetry assignment from a friend of mine who teaches a biological science course. She found that the challenge of condensing a technical topic into a 17 syllable Haiku really forces one to think critically about the subject and filter through all the information to shake out the key concept. And poems about tech topics are just fun to read!
I think the benefit is even greater for a programming course. As tylera5 mentioned, both poems had a structure, and he had to think a bit about how to put his thoughts into the structure dictated by the poetry form, whether it be the 5/7/5 syllable structure of a Haiku, or the AABBA rhyming scheme of a limerick.
Poetry is the expression of ideas and thoughts through structured language (and the structure can play a larger or lesser role depending on the poet and the type of poetry). Programming also is the expression of ideas and thoughts through structured language. The domain of ideas is often more restricted (though not necessarily; this article and book could be the subject of a full post in its own right) and adherence to structure is more strict, but there is an art to both forms of expression.
Are there artistic and expressive tools in other STEM topics as well?
The Tides of Change?
As Linus Torvalds has mentioned in several video interviews, probably the main reason Linux has been lagging behind in the desktop market is that it doesn’t come pre-installed on desktop hardware, and the average computer user just isn’t going to put forth the effort to install and configure* a different operating system than the one that came with their new machine. Recently Dell caused a bit of excitement with the release of an Ubuntu edition of their laptop, the “XPS 13: The Ubuntu developers” edition. To be fair, this is not the first machine that Dell has offered with Linux pre-installed, but it does seem to be the first that they’ve tried pushing to the mainstream (or in this case, developer) community (in the past you really had to make an effort to find the Ubuntu option on their ordering form). Dell is also not the only desktop distributor to offer systems with Linux pre-loaded (indeed, many of the others exclusively offer Linux machines), but it is probably the brand with the most name recognition to the general audience. Could this be the beginning of the end of the Microsoft monopoly on the desktop OS market? I am optimistic!
*Be wary of the blog posts and forum comments that recount stories of installing Linux and being frustrated with the difficulty of getting all the necessary drivers for their hardware and using that as an argument that the OS wasn’t “ready” for prime time. If you have ever installed Windows on a fresh new machine you will be well aware that it can be just as frustrating. Windows doesn’t “just work” on the machines you buy because it is a superior OS (it isn’t), it works because the system distributors like Dell take the time to make sure that the necessary drivers for the particular hardware in the machine are all included.
Re: large scale makefiles
In a recent post thomaswy asked about maintaining makefiles for larger projects. It is certainly true that manually updating lists of dependencies for many files (hundreds… thousands even; the Linux kernel comprises some 22,000 source files) can become tedious. Luckily there are tools that will generate Makefiles for you, though really their motivation is to automate the build process on a wide variety of machines and platforms. If you’ve used Qt for development you have probably been using the ‘qmake’ command. This generates a ‘Makefile’ that is then read with a subsequent call to ‘make’. For more general projects GNU provides Autoconf and Automake. Along with a couple other programs these are referred to as the GNU Autotools. If you ever need to install a piece of GNU software from source (if a package isn’t available for your distribution, for instance) chances are it will use Autotools and you will build it with two commands:
$ ./configure
$ make
The first command will generate the Makefile that is then used when you run `make`. In the process of generating the Makefile, `configure` will check your system for necessary libraries and tools and notify you if something needed is missing. This is also where you would specify optional build parameters. To get a list of options run
$ ./configure --help
If you decide you do want to make your project available to the open source community, it’s a good idea to set up the build process using GNU Autotools since folks will be expecting it.
Another option is cmake, which provides similar functionality to GNU Autotools and in addition has the capability to generate project files for a number of IDEs. A quick google search will turn up several commentaries on the merits of the two build systems (and probably several others).
If you do find yourself writing several Makefiles for larger projects (and even if you don’t), be sure to familiarize yourself with the issues raised in the now well-known paper Recursive Make Considered Harmful by Peter Miller.
Tmux quick-reference
I have added a quick reference to tmux session pairing in the Resources and References section under the Week 9 section. Please note it is not a detailed guide of all the aspects and security implications of using tmux to share your terminal session; make liberal use of Google or another search engine to get more details.
Response to “Class Material – reposted”
In his post titled Class Material – reposted, zickbe asked a very good question about the content of ECE2524. This is a question that has come up at least once every semester. To paraphrase: “Since there are modern GUI tools for Linux now, why are we learning all these old command line tools?” The example given in the post was a simple task of replacing all periods ‘.’ with commas ‘,’ in some text input. Indeed, many graphical editors do have search and replace functionality that makes this particular task quite easy. So what’s the point of learning to do it from the command line?
There are two answers to this question, each from a different perspective.
You as the User
The first is probably the perspective you are all thinking about right now: you as a user of a general-purpose operating system, editing files, writing code, surfing the web, etc. As we have seen already, Unix has a strong tradition as a platform for text manipulation (remember, its first use was as an OS to run a word processing system for the AT&T Bell Labs patent department). When we store our data in plain text we have a large collection of powerful tools to manipulate and process that data.
Of course, when learning new concepts we start with simple examples. One of the simplest ways we can manipulate text is with a literal substitution, for example “replace all occurrences of the word ‘cat’ with ‘dog’ “, or “replace all occurrences of ‘.’ with ‘,’ “. Literal substitutions are used often enough that many graphical tools have implemented the feature into the interface. Let’s say we have a file myfile.txt and we want to change all occurrences of ‘dog’ to ‘cat’, we could either use the terminal:
sed -i 's/dog/cat/g' myfile.txt
Or we could open myfile.txt in our favorite text editor, choose the menu option for “search and replace”, enter “dog” and “cat” in the appropriate fields, click “ok” and we’re done. For this simple case it seems like it’s hardly worth the brain-space to remember how to use sed. Let’s kick it up a notch though. When writing software applications we often have many files associated with one project. What if we wanted to replace ‘dog’ with ‘cat’ across several files? Using the GUI we would open each file in succession, click the menu that contained “search and replace”, fill in our search and replace words, hit ‘ok’ and then repeat for the remaining files. This is probably doable for a few files. What about 100? 1000?
find project/ -name '*.txt' -exec sed -i 's/dog/cat/g' '{}' +
or
find project/ -name '*.txt' -print0 | xargs -0 sed -i 's/dog/cat/g'
The nice thing about this is that the amount of effort we put in is the same no matter how many files we want to process, whether it be 3, 100, 1000 or more. Try doing 100 text substitutions in a GUI and you’re asking for a repetitive stress injury!
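The same "effort doesn't grow with the file count" property holds if you script the substitution yourself. As a rough sketch (the function name `replace_in_tree` and its parameters are made up for illustration, and it assumes the files are small enough to read into memory and use the default text encoding), a Python version of the literal replacement might look like:

```python
from pathlib import Path


def replace_in_tree(root, old, new, pattern="*.txt"):
    """Replace the literal string `old` with `new` in matching files under `root`.

    Returns the list of paths that were actually modified.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        text = path.read_text()
        updated = text.replace(old, new)
        if updated != text:
            path.write_text(updated)
            changed.append(path)
    return changed
```

Calling `replace_in_tree("project", "dog", "cat")` would then do roughly what the `find` one-liners do, at the cost of a few more lines of typing.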
“Ok”, you’re saying to yourself “but how often am I working with hundreds of files at once? I usually just have one or two files I want to modify, it’s not too bad to navigate the GUI menu a few times to do text substitution.” Let’s think of some more examples of text manipulation you might want to do. In my previous post I described the process I went through to compile a list of links to last semester’s projects. At one point I wanted to prepend each line with a ‘-’ character to generate a list in Markdown syntax. I could have just manually added the character to each line, there were only 19, after all, but instead I used a sed command:
sed -r 's/^(.*)$/- \1/g'
It didn’t really save me many keystrokes in this case, but it easily fit into the automated workflow I had set up to convert the list of urls to a nice HTML format suitable for posting on the blog. It’s also a task that would have become quite tedious to do by hand if there were more than the 20 or so items that I had. And if I wanted to do something a bit more complex, like “prepend only the lines containing a url with ‘-’ but leave all others unchanged”:
sed -r 's|^(.*https?://.*)|- \1|g'
Now I can selectively convert lines to a Markdown style list. This is much quicker for even medium sized files than scanning each line by eye to find urls, and then adding a ‘-’. Can your GUI do that? And of course, if I had a few, or a few hundred files that I wanted to process like this, I could use the same `find … -exec` or `find … | xargs … ` idiom I used above.
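For comparison, the selective version carries over almost unchanged to a scripting language. A small Python sketch, assuming the whole input fits in a string (the names `URL_LINE` and `bullet_url_lines` are just illustrative):

```python
import re

# Same pattern as the sed command; re.MULTILINE makes ^ and $ match per line.
URL_LINE = re.compile(r"^(.*https?://.*)$", re.MULTILINE)


def bullet_url_lines(text):
    """Prefix every line that contains a url with '- '; leave other lines alone."""
    return URL_LINE.sub(r"- \1", text)
```

Lines without an `http://` or `https://` in them pass through untouched, just as with sed.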
Another quick example: You are probably familiar with the two main styles of naming functions with multiple words: CamelCase and underscore_case
def myHelloFunction(): pass
def my_hello_function(): pass
Which style you use is largely a matter of preference, although sometimes when working on collaborative projects the project will define a particular style that you must adhere to. Let’s say you’ve been using one style for a few projects and then decide you want to switch (or you get a bunch of code from a friend who was using a different style, or… )
sed -r 's/([A-Z])/_\l\1/g'
Will convert camelCase names such as myHelloFunction to my_hello_function (a name that starts with a capital, like CamelCase, will come out with a leading underscore). Doing the same automatic formatting in a GUI of your choice is left as an exercise for the reader. A quick google search will turn up a sed command to do the reverse transformation.
The take-away from all of this is that while the examples we use in class may be simple enough that a GUI editor just so happens to have implemented similar functionality, the tools themselves are much more powerful. GUIs are great in that they make it really easy to do the things that the GUI designers planned for. However, they make it difficult or impossible to do things that the designers didn’t plan for. In the case where you want to perform a text manipulation on a large number of files, or a complex manipulation on one or more files, the command line tools provide a solution where the graphical tools do not.
You as the Developer
But you’re not just any user, are you? You are getting a degree in Computer Systems Engineering, and even if you plan to focus on hardware it is a guarantee that you will be writing software at some point (probably many points). You may even write some software that needs to do text manipulation. Perhaps a preprocessor for a compiler, or even your own text editor. What if you want to build in some functionality to allow the end-user to do some text manipulation? Maybe a simple text substitution, or perhaps you’re writing an IDE and want to provide a menu option to automatically convert CamelCase to camel_case across a set of project files. How would you implement this? For these examples it probably makes sense to use the regular expression library of whichever language you are programming in, but even in that case, the expressions themselves will be the same as in the sed example. In some cases you may actually want to spawn a child process running one of the sed commands from above directly (maybe you want to run a complex text manipulation on a large number of files that a user selects with a GUI and let the manipulation run in the background while the GUI is free to take additional requests from the user).
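As a small illustration of "the expressions will be the same", here is how the case conversion might look with Python's `re` module. This is only a sketch with a hypothetical function name; the search pattern is identical to the one given to sed, but since Python's replacement syntax has no `\l` escape, the lowercasing is done by a replacement function instead.

```python
import re


def camel_to_snake(name):
    """Insert '_' before each capital letter and lowercase it."""
    return re.sub(r"([A-Z])", lambda match: "_" + match.group(1).lower(), name)
```

Like the sed command, this turns `myHelloFunction` into `my_hello_function`, and a name with a leading capital picks up a leading underscore.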
Summary
As you are working with the command line and working through the examples for this class remember to keep in mind the flexibility of the commands you are learning. In many cases the examples will be so simple that the same functionality has been implemented in any of the popular graphical tools, but the command line version provides much more control and flexibility, as I hope these few examples have demonstrated. Can you think of any other examples that could be done using command-line text manipulation tools but would be impossible in a general purpose graphical environment?
As I mentioned before, this question comes up every semester. How could the material in the class be modified to make the power of the tools we learn more apparent? Should more complex examples be included at the possible expense of clarity? More examples? Was the explanation I gave here convincing? If not, please explain why in the comments and I’ll do my best to revise!
List of Projects Past
I wanted to generate a list of Unix projects from last semester to provide a reference for what has been done before. The first complete list I found was in the forum of last semester’s Scholar site. There wasn’t a “download” option, and given the formatting I predicted that I’d spend more time writing a program to parse the raw HTML file and properly match up titles to urls than I’d save by just doing an old fashioned copy and paste operation for the 19 projects. So that’s what I did, and ended up with a file with one title/url per line, formatted like this:
Snake Game https://github.com/ccwertz7/2524snake
Go, Dog, Go https://github.com/rokthewok/ECE2524-Final-Project
...
Just white space separating the title and the url. I wanted to generate a nicely formatted list of links to each project and also filter out any urls that were no longer valid, so I wrote a short python script that read in lines of text, split them at the ‘http’ of a possible url, treating everything to the left of the split as the title, and then used the urllib module to check for a 200 status before printing out a markdown formatted line:
[Snake Game](https://github.com/ccwertz7/2524snake)
[Go, Dog, Go](https://github.com/rokthewok/ECE2524-Final-Project)
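The script itself isn't reproduced in this post, so the following is not that code, just a minimal sketch of the approach described above. The function names are hypothetical, and the `check` parameter is an addition of mine so the network lookup can be swapped out when trying it offline.

```python
import urllib.error
import urllib.request


def split_title_url(line):
    """Split 'Some Title http...' at the first 'http'; return (title, url) or None."""
    index = line.find("http")
    if index == -1:
        return None  # no url on this line
    return line[:index].strip(), line[index:].strip()


def url_is_ok(url, timeout=10):
    """Return True if fetching `url` succeeds with an HTTP 200 status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, ValueError, OSError):
        return False  # unreachable, malformed, or an HTTP error status


def markdown_links(lines, check=url_is_ok):
    """Yield a '[title](url)' line for each input line whose url passes `check`."""
    for line in lines:
        parsed = split_title_url(line)
        if parsed and check(parsed[1]):
            yield "[{}]({})".format(*parsed)
```

Used as a filter, something like `print("\n".join(markdown_links(sys.stdin)))` (after `import sys`) would read titles and urls on standard input and write the Markdown lines on standard output.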
I have uploaded the source for my script, url_200filter.py, to the examples directory in my ece2524 repository on github. After checking the output of my program I decided that I’d like to create a bulleted list by preceding each line with a ‘- ‘ (this is standard Markdown syntax). Rather than change my python script, which is already on the verge of being too specific to be useful for much else, I just piped the output through a simple sed command to tack on the dash:
url_200filter.py < Spring2012ProjectList.txt \
  | sed -r 's/^(.*)$/- \1/g' \
  | markdown > Spring2012ProjectList.html
To generate the HTML code that I then pasted here:
- Snake Game
- Go, Dog, Go
- Battleship Game
- Settlers of Catan
- Pyramid of Unix
- Guessing Game
- Timer Application
- Dodge the ASCII Characters Game
- Millionaire Game
- Terminal Game
- Checkers Game
- Text Adventure Game
- Monopoly
- Hangman
- ECECalculator
- Class Catcher
- Conway’s Game of Life
- Tic Tac Toe
- Leaderboards Hub
While you are thinking about project ideas keep in mind that the goal of the project is to apply and demonstrate understanding of the Unix design philosophy. A project doesn’t need to be terribly complex to show this, but if you have a complex idea that you would really like to explore it may be sufficient to just implement a piece of it for this project. I look forward to hearing about your ideas and discussing some details next week!