Tuesday, March 30, 2010

Visit to University of Washington

Over spring break, Steph and I were in Seattle doing a variety of fun things to fill our time. Most important among them: the official graduate school visit days for UW Computer Science (from now on, UW refers to the University of Washington, not any strange institutions in the Midwest).

I haven't really written much about the other visit days because I haven't had the time to come up with something coherent to say. All of the visits are of the same form: on the first day, spend your time alternating between short (20-30 minute) meetings with graduate students and potential advisors, and sitting through fire-hose presentations that are not terribly useful if you've read up about the department and school beforehand. The template for the second day (if there is one) is to do more "fun" tourist-like things in the city, and get to know the current graduate students or professors in a less formal context. Some of the things done to this end have included frisbee golf, drinking, snowshoeing, drinking, eating, and drinking.

So, by the time one gets to the third such visit, everything falls into a familiar pattern, and it can at times be difficult to put on a mask of sheer excitement while watching the 20th PowerPoint presentation in as many days. Another familiar pattern was the faces and archetypes of other prospective students: by the third visit, I had already seen quite a few of them at other visit weekends. One such student is Shaddi Hasan, with whom I went drinking on two consecutive weekends in two different cities (Boulder, Seattle). After three weekends, you can pick up the differences between east-coast and west-coast students, tell who is trying to show off, and spot who is absolutely mortified to interact with strangers.

Day 1

On the first day, I met with lots and lots and lots of people. Of these people, two were professors (Prof. Ernst and Prof. Notkin) and the rest were graduate students. Interspersed with these half-hour meetings were various PowerPoint presentations that showed the current research of a few professors. For lunch, I went with a group to a nice Salvadoran restaurant off The Ave. In the afternoon were some more meetings, and then finally after all this, there was a fancy dinner and graduate student party in the HUB (Husky/Student Union Building).

The dinner and reception were probably the better parts of the night; Steph was with me, and was able to socialize with some of the graduate students and get a better feel for what the department is like in a social sense. We both sat near Dan Grossman for dinner and listened to his stories of hiking for an entire summer, among other interesting bits. One of the funnier parts of the evening was the CSE Band, which made alternate lyrics for pop and oldies songs and performed them live at the grad student party. I don't remember many other details from the party, but it had been a terribly long day and I had been through a bit of beer by that point.

Day 2

The second day, as explained above, tends to be less hectic and more personal. At CSE, I actually spent most of the morning doing yet more meetings with people. This actually isn't so bad, since I was still running internally on Eastern Standard Time (hence, a 9am meeting was actually a noon meeting for me, and all was good). It was at this time that I met with Luis Ceze and Dan Grossman formally. Doing all of these formal meetings can sometimes be awkward, especially if you do not have a clip full of questions to use as ammunition. At this point, I had already talked to both professors the previous night, and of course had read every scrap of available information on the website.

After these last few meetings, Steph and I went to Hank Levy's house with the Security/Systems group for a nice lunch. His house is at Sand Point, between the Burke-Gilman trail and the water. I was able to speak with Alexei Czeskis, a Purdue CS graduate and one of the star undergraduates that my cohort looked up to as little freshmen and sophomores. After eating, I hung out on the dock and talked with some people about my JavaScript work and the challenges of JavaScript security. Later, I returned to the house, where Steph was talking to none other than Mark Zbikowski, employee #55 of Microsoft and architect of NTFS, Cairo, the MS-DOS executable format, and other things. Apparently he's returning to graduate school ^_^ He had some fascinating stories to relate, and it is valuable to hear advice for a new Microsoft hire from someone who has been there since the near-beginning.

After this lunch, we returned to the Paul Allen Center and then headed out with the PL/SE groups (Notkin, Grossman, Ernst) for some [indoor] beach volleyball. This was way more fun than I expected: professors diving face-first to make a save, trading jokes with potential advisors, and getting some exercise at the same time. While neither of us is much good at volleyball, we had a good time getting to know the personalities of the group a bit better. Afterwards, we killed some time in the park throwing around frisbees, and then headed home.

We were way too tired to go out after another full day, so we grabbed some quick food at Thai 65, tried out the hot tub at the hotel, and went to bed quite early.

--

Overall the visit went very well. I learned a lot about the department, the people whom I may be working with, and the current things that are going on research-wise. I was also able to get a sense for what projects will be available next fall, and possibly even in the summer if an internship never comes through. I still have a while before I'll feel as comfortable talking to new professors who don't know me as well, but I'm confident I'll fit right in with the people in Seattle.

In the next post, I'll talk about why Chicago was especially green, the pros and cons of using Amtrak vs. flying, and the rest of our spring break in Seattle.

Friday, March 26, 2010

MSR Video: Research Perspectives on JavaScript

Just today, Channel 9 (the MSDN channel that covers Microsoft Research) posted a video about "Research Perspectives on JavaScript", featuring Erik Meijer as interviewer and the Bens (Ben Livshits and Ben Zorn) as interviewees. This video is particularly interesting to me, because their JSMeter project is highly related to our PLDI paper this June (we even get several mentions in the video). I'll summarize some of the main points of conversation, but will omit some details as the video is quite long (50 mins). My comments/opinions will be interspersed and set in bold.

Erik begins by asking about the names. How do they always come up with funny names like Gatekeeper, JSMeter, and so on? Ben Zorn explains that it is important to pick a good name, because names tend to stick in people's minds better than paper titles (unless authored by Phil Wadler). I agree. That said, I would much rather have a boring paper title than a ridiculous backronym project name, the likes of which are way too common on large projects in the sciences.

Next, the JSMeter project is discussed. The project goal is to find out what exactly JavaScript code in the wild is doing, and how it compares to C, Java, or other languages. They instrument Internet Explorer's interpreter, and aim to measure the behavior of *real* applications that end-users visit every day.

As far as their methodology (as in, what data is actually recorded), it is very similar to what we did in our research. There are some differences: they measure physical heap usage of the DOM vs JavaScript whereas we only measure the objects on JavaScript heap (without respect to physical size of objects), and they measure callbacks and events explicitly. As far as analysis of the data, our approaches diverge according to our goals, but cover many of the same statistics.

The first point Ben Livshits talks about (and their main conclusion in the paper) is their observation that SunSpider and other JavaScript benchmarks do not have much in common with real-world applications such as GMail, Facebook, Bing Maps, etc. The second observation is that function callbacks are typically very short on sites that need high responsiveness. SunSpider is called out for having a few event handlers with very long execution times, which biases against interpreters that handle small functions well (such interpreter behavior is desirable in the real world).

I'm not objectively sure what "short" is, but we did see in our work that function size was fairly consistent with respect to static code size and dynamic bytecode stream length. We could not distinguish function invocations which were callbacks though, unfortunately. I also wonder if opportunities for method-level JIT'ing are overrepresented in SunSpider and (especially) V8 benchmarks.

There was some discussion of whether JavaScript as a language will evolve to be better, and also the tricky questions of what one would add to or remove from the language. Ben Zorn points out that JavaScript, unlike Java or C, is usually part of a complex ecosystem in the browser. This means that ultimately, it may evolve, but only slowly and in lockstep with other complementary languages. He also calls it a "glue" language as opposed to a general purpose language, one that mainly deals with making strings and gluing together the DOM and other technologies.

I agree that it is bounded by other technologies, but it can also be a general-purpose scripting language (see for instance its use in writing large parts of Firefox, and as a scripting plugin for many environments and games). I think the issue of poorly designed semantics (the root of all trickiness in efficient implementation) is orthogonal to the issue of whether it's a generally expressive and useful language. PHP is another language in this vein (apparently useful, but horrible semantics).

Erik asks about the use of dynamic features of JavaScript by developers. Ben Livshits immediately concedes that many people use eval, and while some of its uses are easy to constrain/replace with safe behavior (JSON parsing, for example), some are "more difficult". But, he does not see this as a very big problem because a lot of contemporary JavaScript code is written by code generators. Ben Zorn explains that with results from JSMeter and our work, researchers and implementors can gauge the impact of certain restrictions (such as saying "no eval" or "no with").
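To make the JSON case concrete, here is a minimal sketch of the substitution (the payload string is invented for illustration; JSON.parse is the ES5 built-in, and older engines need the json2.js shim):

```javascript
// A string of JSON data, e.g. the body of an XMLHttpRequest response.
var payload = '{"user": "alice", "unread": 3}';

// The old idiom: evaluate the string as JavaScript. The parentheses
// force the parser to treat the braces as an object literal, but this
// also runs arbitrary code if the payload is ever attacker-controlled.
var viaEval = eval('(' + payload + ')');

// The constrained replacement: JSON.parse accepts only the JSON
// grammar and never executes code.
var viaParse = JSON.parse(payload);
```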

We actually approached it from the other end in an effort to investigate the implications of assumptions already made in JavaScript research. Since our sources and data are freely available on the project webpage, it's possible to go in the other direction as well by tinkering with our framework and replaying executions on an interpreter simulator with different semantics.

Our conclusions are a bit different in this area, as well. You can read the paper for more details, but shortly, we think static typing and static analyses for JavaScript will either be too brittle, too expensive (due to code size), or too flexible to make any useful guarantees. That said, we see lots of room for heuristic-based optimizations, which have already made inroads into implementations of Chrome and Firefox.

We learn that there is a dichotomy between sites that use frameworks and libraries, and sites that use handwritten code. We saw about half of the top 100 sites use an identifiable library. Script size is also seen to be very large. Erik asks about the functional nature of JavaScript: do scripts often use higher order functions? They defer to our study for quantitative numbers (thanks for the mention) and say that usually it is frameworks and translators that use HOF's (for example, jQuery's extensive use of map/reduce patterns and chaining). Of course, callbacks and event handlers are one ubiquitous use of closures. Ben Livshits talks a bit about library clashes (i.e., different libraries may change built-in objects in incompatible ways), which they did some work to detect statically in other research. I know that Benjamin Lerner at UW has done some work in this space, in the context of Firefox plugins and how to make them play nicely together. He makes an anonymous jab at some news sites that sidestep such incompatibilities by loading separate widgets in their own iframes (at the expense of horrible page load times).
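As a tiny illustration of the higher-order function and closure patterns discussed above (a made-up counter, not an example from either study):

```javascript
// A higher-order function: it returns a new function that closes
// over the local variable 'count'.
function makeCounter() {
  var count = 0;
  return function () { return ++count; };
}

// Event-handler style usage: the callback is a closure whose state
// outlives the call that created it, just like a DOM event handler.
var onClick = makeCounter();
onClick(); // returns 1
onClick(); // returns 2

// Each counter captures its own 'count'; the two do not interfere.
var other = makeCounter();
other();   // returns 1
```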

Erik returns to the issue of language design: what would you like to remove from or add to the language? Ben Livshits talks about subsets of JavaScript, and their usefulness for writing safe and limited code (such as in ads). This approach has been used in several papers, but does not yet seem to have much traction with browser vendors. It would be nice, though. In general, it is agreed that JavaScript needs fewer features, not more. I would start with removing the 'with' statement. Ben Zorn says that nothing needs to be added, because things like classes or other language features can be built on top of prototypes. That said, he is not convinced either way as to the usefulness of a prototype-based vs a class-based object system. Yeah, me neither. He then explains prototype-based objects and ways to optimize programs in this paradigm, such as V8's "hidden classes".

Ben Livshits says that the big strength and weakness of the Web are the same thing: being able to load code and data from disparate servers, and combine them fairly arbitrarily via very late binding. Predictably, the two Bens are at odds over whether this is a good thing or a bad thing for the web in the large. On the one hand, having no rules makes it easy for a 12-year-old (or a developer at the same skill level) to hack something together because the platform is so tolerant of broken code. On the other hand, this flexibility invites a lot of security problems and ties the hands of those more proficient developers who want more invariants and guarantees about their code. This lack of discipline and control is probably what drives companies to translate large programs into JavaScript from some other language like C# or Java.

One lesson of JSMeter that Ben Livshits talks about is the possible benefits of more integration between the language and the platform. Many times, browsers load the same page over and over, but do not learn anything about how that page actually behaved. Ben's example is that if code only runs for a few seconds, then it is not useful to run the garbage collector (as opposed to other methods, such as mass freeing by killing a process or using arena/slab allocation). Right now, browsers are utterly amnesic about what happened the last time (or 10, or 1000 times) they loaded a page, and only cache the source text on the browser (as opposed to the parsed AST). This is something that jumped out at me as well. Sounds like an interesting thing to look at. They talk about this again near the end.

Erik asks whether the parallel but separate specifications and implementations of JavaScript and complementary technologies like the DOM are necessary. Why not just make one specification to rule them all? Both Bens say that the border is fairly arbitrary, and increasingly applications pay a large price when they cross that boundary frequently. Ben Livshits also says that going forward, it is a bad idea to ignore these other technologies when thinking about analyses and optimizations. They did not suggest any specific methods for such cross-border optimizations, though.

This is a huge problem with tracing JIT's like TraceMonkey, because they have to end a trace whenever it leaves JavaScript for native DOM methods (usually implemented in C++). V8 (Google Chrome's JavaScript engine) tries to minimize the number of such exits by implementing most of the standard library of JavaScript in JavaScript, and using only a small number of stubs. Another approach may be to compile the browser with the same compiler that does the JIT'ing (say, with LLVM) and then there is less penalty for crossing the DOM/JavaScript execution boundary in machine code.

Ben Zorn goes as far as to claim that JIT's can only go so far to improve JavaScript performance, and DOM interactions are the lower-hanging fruit right now. He bases this on the fact that most scripts are not computation-heavy, either because they are interactive and wait on the user to create events, or because they spend most of the CPU time inside of native DOM methods. Ben Livshits thinks that one of the biggest challenges is that JavaScript (and web applications in general) are network-bound, instead of CPU- or memory-bound. Essentially, download time and network latency dominate any other sources of delay.

I agree on the 'being interactive' part, but disagree on the compute-heavy part. Especially with things like games, the canvas element, and animations in JavaScript, numerical computation is starting to become significant. Furthermore, as JavaScript becomes more and more the 'assembly of the web', I would guess that the CPU time will tilt towards general-purpose computation, and away from DOM calls (which most significantly are used for the V in MVC).

--

It's great to hear at length from some of the other folks doing research work around JavaScript. I'm looking forward to seeing the final version of the JSMeter paper at USENIX Webapps, and also am looking forward to our paper being presented at PLDI in June. Every time our work is presented, we get lots of new and diverse feedback, which raises ideas we have not yet considered and forces us to dig deeper into our understanding and data.

Thursday, March 25, 2010

A tale of three bikes

Spring is (nearly) here, and in Indiana that means it's time to dust off the bike saddle and begin the time-consuming process of tuning, upgrading, and repairing bicycles. Wait, bicycles is plural, right? Let me review the three bicycles that Steph and I share:

  • The fixie: Steph rides a red fixed-gear bike mostly around town. I confess to not knowing its particulars, as I'm not a fixed-gear enthusiast. All I know is that it would not serve me well on Chauncey and 9th St hills.
  • The frankenbike: My father had an old Trek bike (circa early 90's) hanging up in the garage, so I was allowed to borrow this bike for the school year. I've done some good rides on it (20-40 miles) in the past semester, but it lay largely unused during the winter months due to Purdue's propensity for poor road maintenance.
  • The Felt: Steph bought a Felt Z80 two years ago on a whim after her just-purchased-that-day bike was totalled in a car-bike accident. Unfortunately for her, she didn't know much about bikes and picked one that she no longer enjoys riding. So, I may end up using it as my main road bike.
All three bikes have required some service prior to use this season:
  • The fixie had a flat tire, which needed to be patched. Thankfully, Steph is very good at patching tubes, so this was not a big problem. The chain was also cleaned a bit. It is remarkable how little maintenance this bike needs compared to the other two geared bikes!
  • The frankenbike needed a lot of cleaning: aside from the chain, I doubt that any part of the frame or the components had been cleaned since Bill Clinton was trying to pass health care reform. Of course the chain needed cleaning too, but that is because I rode several hundred miles in the fall semester with minimal cleaning. Beyond last semester's slight upgrades (new handlebar tape and front/back lights), I bought two new tires and had both wheels trued at Hodson's Bay.
  • The Felt bike has not really been used for substantial road biking, so I have decided to try it out on longer rides to see if I like it. Aside from the dirty chain that I had to clean, there was little to be done to make the bike rideable.
For Christmas this year, I received some more biking gear as a gift from Dad: a pair of biking shoes, and some very nice clipless pedals. At first, I was planning to put these on the frankenbike, but realized that I might as well put them onto the Felt bike if I was going to only use the Felt bike for long rides. (For those who don't know, clipless pedals and shoes work similarly to ski bindings. The main purpose is to keep your foot attached to the pedals for better efficiency.)

It took me the better part of 3 hours to assemble my shoes, remove the pedals from the Felt bike and frankenbike, and install the new pedals. For now, I put the cheapo plastic $3 pedals on the frankenbike, but might change back to the old, dirty, metal pedals that used to be on it. Perhaps next time I want to take off old dirty pedals, I should buy a pedal wrench, because it took an unbelievable amount of force to remove the damn pedals with a US 5/8" wrench (not metric 15mm as it's supposed to be).

Once assembled, I went out for a short (16mi) ride up S. River Road in West Lafayette, taking the huge hill on N 500 W (11% grade for about 1200 ft), and returning by Lindberg Rd/Salisbury. Overall, riding the Felt bike is much more fun than the frankenbike: it is much lighter, has a larger cassette and 3 chainrings instead of 2, and has a Shimano STI shifter (indexed shifting), whereas the frankenbike has old-school knobs (friction shifting) that you have to manually move up and down and hope the derailleur has moved to the correct position.

If you don't know what these terms mean, imagine the difference between fretted and unfretted string instruments. With a violin, you must know exactly where to place your fingers on the string to make the correct note; this is like how shifting on the frankenbike works. On the Felt bike, the distance between gears is fixed (like the distance between frets on a guitar).

My goal for the rest of the semester is to do some sort of exercise at least 3 days a week, on Monday, Wednesday, and Friday. If the weather permits, I'd like to do short (20-30mi) rides. If it does not permit, I can go to the pool for a while instead. Eventually, I'd like to be able to do a 50 or 100 mile ride; most of those are in May or later, so I have a lot of time to train and work up my mileage. I never got much beyond 35 miles in one ride last semester, but I have a lot more time in my schedule for long rides now. Or so I hope.

Tuesday, March 23, 2010

Our WebKit instrumentation and PLDI dataset has been released!

In preparation for the PLDI camera-ready deadline later this week, Gregor and I have finally gotten around to packaging and uploading the files we used in our experiments.

The first set of files is the raw sources of the instrumented WebKit branch, the trace analyzer and static analyzer, as well as the database generation and graphing infrastructure.

The second and third downloads are the traces that we used as our raw data set in the PLDI paper, as well as the resulting database when these traces are analyzed and inserted into a sqlite3 database. These files are fairly large, but we feel that it is important to allow our experiments to be recreated by third parties. One frustration of working in the programming languages research field is that all too often, if a technique or analysis is tested by an implementation, the corresponding sources and datasets used are not made publicly available. This makes it impossible to verify results, look for possible improvements, or find inaccuracies in papers. We want to do our part to combat this trend by providing everything that we used in the development of our work's conclusions.

The files are hosted at Gregor's Purdue website: http://www.cs.purdue.edu/homes/gkrichar/js/

For those keenly interested in building and running the project themselves, there is a somewhat useful README in the top-level directory of the source tarball. Happy hacking!

Friday, March 12, 2010

Code Bubbles (aka super-Eclipse)

This post begins a hopefully-weekly series of posts about research that I have recently read or found out about. I may write them sporadically, but I hope to get them out on a regular schedule.

Over in software engineering-land, a new paper at ICSE 2010 (May 2-8, Cape Town, South Africa) unveils the idea of Code Bubbles.

If you go to the site, there is a 1080p YouTube demo of the idea. Code Bubbles is an IDE where related bits of context can all be viewed together spatially, regardless of source file or class. These independent code snippets are displayed in separate graphical "bubbles", and can be grouped together according to task. Additionally, things like emails, notes, Javadocs, and other assorted content can be loaded into a "bubble" and grouped with related content. The whole system seems to take the concept of "working sets" to its logical conclusion.

One common task in Java IDE's such as Eclipse is to read some code, recursively look up some declarations of types or methods, and then write some new code that incorporates all of these different types and methods. A similar process happens when a developer tries to find and fix a bug: starting from a stack trace or error message, one must trace through the execution of the program, possibly through many disparate classes, methods, and so on. In Eclipse and similar IDE's, this task involves opening many files as tabs in the same workspace; after trying one particular program trace, the windows must be closed individually, and if one finds a good combination of views that expose the bug, there is no way to serialize it so that others can also use the view.

Code Bubbles could potentially make the above situation much nicer: it has debugger support, which can automatically open relevant code snippets from the call stack at any point during execution. One of the best features in my mind is the ability to import and export working sets (groups of bubbles) by email. Coupled with some design explanations, a bunch of pre-determined snippets of code could go a long way towards documenting the architecture and important parts of a complex program. Other cool features I noticed include
  • Traditional class/method browser- from which, any method can be pulled out into its own bubble.
  • User queries- I didn't get a good idea of how this worked, but it seemed to be a full-text search on method, variable, class, and type names (similar to the Awesome Bar in Firefox).
  • Labeling of bubble groups and workspace areas
  • Zoom in/out for rearranging groups

Just looking at the video, I'm quite impressed by the amount of polish present (for a research work). Their framework is built on top of Eclipse, and uses its backends for the dirty work (parsing/syntax highlighting, static analysis, editor UI). I'm interested in the details as well; they will post a PDF after the conference is over in May.

• • •

I think there are plenty of exciting directions to go from this work. For one, it would be useful to find out through practical experience when this sort of interface is useful and when it is not. My guess is that this will be great for discovering a new codebase or hunting for bugs, but not as useful for writing new code. It also seems to require a 24"+ monitor to really shine (which is fine, because there is no other practical use for such large desktop monitors).

One idea that struck me is that since the workspace is so large, what would happen if multiple people could simultaneously share the workspace and see each other's changes in real-time? I don't know if such a scheme would work as well in an Eclipse-based application (cf. web-based collaborative editing such as CodeCollab and collabedit, or systems like SubEthaEdit built from the ground-up to support distributed/collaborative editing). It would also pose interesting (read: nontrivial) questions about the integration of version control into such a system, especially if there is no notion of who "owns" a certain version of a file.

The last point I want to address is the interesting yet puzzling response to the project on the rest of the Internet. On a Slashdot story about the project, the overall responses were neutral to negative. On a similar LTU story, almost all the responses were positive. This is surprising to me, because it is usually the reverse: LTU'ers complain about IDE's reinventing Smalltalk, while Slashdotters coo over pretty graphics. I guess the one thing that helps to explain the discrepancy is the use of Eclipse and Java, which (somehow) has a better reception at a programming languages research blog.