I'm moving. I'll be at recursed.org from now on.
03 July 2011
21 June 2011
Software Stabilization for Video!
Via Google Research:
Casually shot videos captured by handheld or mobile cameras suffer from significant amount of shake... Our goal was to devise a completely automatic method for converting casual shaky footage into more pleasant and professional looking videos.
Eschewing the promo video, I decided to try and run the technique on a video I'd shot from a golf cart that was moving, while climbing a hill, after I'd had a beverage or few. I've added the software-stabilized and the original videos below. Really quite incredible:
Video Stabilized:
Original:
14 June 2011
Tim Bray on the Android Ecosystem
Went to a presentation by Tim Bray. Here are my notes:
Indication of the state of the Economy:
Who's looking to hire people? Who's looking for a job? Stand up and talk to each other afterward.
Explains his role: I'm an advocate, not an evangelist. Tell me about your experiences so I can take them back to the product group.
- More than 4BB mobile phones in the world today.
- Only 1BB PCs
- Only 5 years to get to 225MM users for iOS
Vast majority of developers don't seem to be making any money. In general, here are the various ways to make money:
- App sales
- App upgrades (Oracle is great at this)
- In-app advertising (this is a substantial driver of revenue for a lot of developers, banging the Google drum)
- In-app sales (this is a big deal)
- Subscriptions (TripIt, for example, leverages a server-side platform. 37signals too)
The mobile industry wants you to think you're selling to a young urban hipster. Too many mobile apps are aimed at solving first-world problems. This is not just not smart, but immoral. Note that the population in the third world for mobile phones is exploding. Don't scope your demographic too narrowly to just the US.
Who's buying and installing apps: US is the biggest, next up Japan, next up Korea, next Germany and Britain.
Multiple APK support: Can now provide multiple APKs that target different segments. Question from audience: Can you ship an app per carrier? Not sure.
Fixing the insane app count:
Two hundred thousand apps and counting. Featured apps get a 25x to 50x spike in downloads. Adding in badges: Editor's Choice, Top Developer, Top Grossing etc. Should help distinguish apps.
Question from audience: black-listing? Not a bad idea.
Uninstalls for apps are very high value signals when it comes to rating an app.
Frustrated question from audience: Why so many ways to rate things? No answer.
Direct Carrier Billing is 50% of revenue: "put it on my phone bill". Transparent to the developer. Didn't take with T-Mobile, but now going nuts in Asia. Every carrier wants this, but it won't happen quickly (two problems):
- Carriers have billing from the 1950s, so it's a fierce engineering challenge
- When you do carrier billing, they show up with 3 engineers and 11 lawyers
Some complaints from audience about FUD around DCB and latency to app deployment: Google's core competence never really included communication. But we're talking about things.
What's coming in 3.0 and 3.1 (Ice cream sandwich.. no 'J' yet)
- new 'Holo' theme
- the Palm guy is the UI Czar
- Fragments: widgets that have a lifecycle within an activity. Helps during rotation.
- Really slick notification interface (ribbing at Apple)
- Menu bar is always on, but you can put it in "lights out mode", which blacks it out
- New Action Bar on the top: like a menu bar on a PC app, and it's contextual based on your app
- Renderscript: C like syntax that will exec on the GPU, and runs on LLVM under the covers
- Much better animation
- HTTP Live Streaming (data rate sensitive with backoff)
04 June 2011
Visualizing ten months of work in under two minutes
Since wordsinmedia.com was written using Subversion as its version control system, I was able to run the wonderful gource visualizer on the version control logs. Here's the end result:
A view from the Oyster Dome
Midway to the top of the Oyster Dome last weekend with the Ogdens. Four images stitched with Hugin. Click image to expand to a much higher resolution PNG.
24 April 2011
Masters in CS: worth it after professional experience?
I'll be defending my Master's thesis on May 6th, and if all goes well I'll be graduating with full pomp and circumstance a few weeks thereafter. Hoping that these notes on going back to school to get a Master's in CS after working for many years may prove useful to someone else, I figured they should be captured before I (hopefully, fingers crossed) graduate.
Quick Context
I'd been working for an organization that had treated me well for about six years. Things were good: I'd started out as a developer, almost fresh out of college. Over the years I picked up various other responsibilities. Eventually, I landed up leading a team that built and maintained a lot of software integral to the organization. This was 7 years after finishing my bachelor's degree.
Learning on my own
I used to busy myself with various projects outside of work. I'd build weather stations (which got hit by lightning sporadically), write my own Android apps, try and build strange home automation tools, and sometimes try to instrument my pets (never really worked).
Learning on the job
Put delicately, work had stopped challenging me as much as I would have liked it to. Problems existed that needed to be solved, but they seemed to follow similar patterns. The ones that scared me had been solved, or weren't solutions the organization needed to invest in anymore.
Between work and my own tinkering, I just didn't have the commitment to dig in and try and learn a whole bunch of things in a thorough way. I'd read up a lot, but there wasn't any accountability to really understand things in a way that was truly beneficial. Most of my reading/tinkering made me feel like I was a spectator to what was going on, where I really wanted to be a part of it.
Finding an institution
Seeing that most formal education offers a grounding in a wide array of topics, I felt that a Master's would be a good idea. I started to scour websites of various colleges and universities for information and details on admissions. My undergrad performance left a lot to be desired; my second year's grades read like output from a loop over Random.nextDouble(). But I hoped that my professional experience might offset that disaster. I started to contact various admissions departments for more information on programs, and for queries on admissions requirements. This proved to be really frustrating.
Tangential vent: who's the customer?
Someone who wants to go to a higher learning establishment is (reasonably speaking) in economic terms a client. They pay the institution for a service: education. Yet most non-academic (i.e. admissions, payments, records, etc.) departments at every university/college I dealt with treated me horribly: no returned calls, no real information, and never any emails with substance (if replied to at all). This was very disheartening. You're supposed to make a commitment of many tens of thousands of dollars* over two years to people you can't get straightforward and thorough details from about what their program has to offer?!
The Academic Departments know what's up
The best way to learn more about a program, to understand how it functions, and to get details on admissions etc. is to contact the Chairperson for the CS department (CC their assistant too, if they have one). In general, they're very pragmatic, and will make considerations if you weren't a rockstar during undergrad but have had good working experience. Also, they're going to be the ones to eventually decide whether they want you in their program, so building a relationship with them up front can't hurt.
Distance learning?
Since my local options had run dry, or were just painful to deal with, I decided to expand my search to universities that provided distance learning. Not surprisingly, any college that provides distance learning also does a pretty good job of communicating with potential applicants via email**. I landed up getting admitted into Hofstra after learning more about their CS program.
Not easy, but rewarding
About 90% of my courses were really good. In general, the amount gained from a course was proportional to the amount invested by the professor. Surprisingly, I think that all parties (educators and students) have to put in a lot more effort in a distance learning setting than they would in a traditional classroom setting. Consider that your average course taught over distance learning requires that the professor create a video, slides and notes, and provide references to various supplementary materials, as part of a single lecture. Instead of office hours, you have discussion boards where everyone participates and sometimes the professor throws down a specific discussion point.
The net result is that you have a much higher bandwidth with your professor: you're in email contact with them often. Frequently, professors will provide their IM handle so you can IM them anytime too. You get to watch the lectures on your time: evenings, weekends or whatever works.
I landed up spending weekend after weekend on school for about two years. Most evenings in the week had me hunched over my laptop listening to lectures or answering questions. It isn't easy, but distance learning makes it possible if you have a job that's demanding. You have to develop the discipline to context switch completely though. I kept my "school" laptop on me at all times so that I could power up and watch a lecture or answer a homework assignment if I had any free time***.
Was it worth it?
Yes. It's been a really rewarding process. I've had a bunch of classes, of which my favorites were:
- Algorithm design and analysis: going all the way down to fundamentals and approaching each data structure from an implementation standpoint and analyzing them for every possible operation. Order of magnitude is my best friend.
- Programming Language Concepts: All I'd had any real experience with were imperative languages, so this was a real mind expander. It exposed me to the fundamentals of functional languages from a theoretical standpoint and then I got to play with them too. (I had to write code in ML and see the most sensible compiler output ever). This class alone made the whole gig worth it.
- Security: which makes a lot more sense after you've had to deal with it in a corporate setting. I got to understand asymmetric key generation, encryption, and decryption by hand (and wolframalpha.com). Not to mention getting familiar with Kerberos and all sorts of other network based authentication systems that make you chuckle when you get a sales pitch about SSO from a vendor.
- Operating Systems: Based on Andrew Tanenbaum's Modern Operating Systems. Deep dives into processes and threads, memory (everything you ever wanted to know about managing memory, algorithms to do so, and tradeoffs), and file system design.
- Databases: Really understanding how they're built. This proved immensely valuable at work- I took all the theory and was able to apply it back in practice.
- Advanced Data Structures: Implement every data structure possible with a different one. Get familiar enough with Huffman to create tables and then compress boring documents during a meeting for fun, on paper.
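For anyone who would rather not do the Huffman exercise on paper, here is a minimal sketch in Python of building a code table and round-tripping a string through it. It is only an illustration of the textbook greedy construction, not anything from the course itself; the function names (`huffman_table`, `encode`, `decode`) are made up for this example.

```python
import heapq
from collections import Counter


def huffman_table(text):
    """Build a {symbol: bitstring} table using the greedy Huffman construction."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}

    # Each heap entry is (weight, tiebreak, partial_table). The unique
    # integer tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


def encode(text, table):
    """Concatenate the code for each symbol."""
    return "".join(table[ch] for ch in text)


def decode(bits, table):
    """Walk the bit string, emitting a symbol whenever a full code is matched."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    message = "boring meeting notes"
    table = huffman_table(message)
    bits = encode(message, table)
    print(table)
    print(len(bits), "bits instead of", 8 * len(message))
    print(decode(bits, table) == message)
```

Frequent symbols end up with shorter codes, which is where the compression comes from. Decoding works without separators because no code is a prefix of another.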
Timing
I'm glad I waited to do my Master's until after I had a bunch of work experience behind me. Encountering problems in real life (and sometimes solving them in a Rube Goldberg fashion) allowed me to gain a lot more from school than I would have had I not encountered the problems beforehand. I think you get a lot more 'ah-ha!' moments.
Consequences
My primary rationale for going to school was that I felt like I needed to be challenged with problems that I hadn't encountered before. This I got in spades. And then there was the character building from the volumes of homework, assignments, papers and such. But it also became glaringly apparent that this would be a finite engagement. Eventually you graduate (you hope!). I wanted to continue this education/masochism. To some degree then, this whole experience helped me realize that I needed to find a different job: one where I would have to face a whole bunch of problems that would make me nervous, and would have to learn a lot on the fly. And so, I started job hunting. But that's a different story...
And now, enough procrastinating and back to wrapping up my thesis so I can try and graduate...
Footnotes:
* Be prepared to invest between $20K and $40K for a good distance learning program. Plan on getting a laptop that you'll use for the entire program too.
** Drexel and Hofstra were two that did particularly well.
*** The amount of energy you have to put in is a lot. Plan on getting hit for no less than 15 to 20 hours a week. Recovering once you fall behind is really hard, since the homework and discussion board posts all start to add up and create a huge backlog.
03 April 2011
27 March 2011
More juice
Recently, I discovered a problem with wordsinmedia.com.
There are three main parts to this system:
- a database that stores stuff
- a set of perl programs that acquire and process the news and store them in the database
- and, a website that sits on top of the database whose backend executes within Jetty
The additional processing on the perl side is CPU intensive, and with more news to process, more CPU was being burned. With more data from the perl side, the MySQL instance had changed its growth rate: queries that dealt with hundreds of rows earlier were now dealing with tens of thousands, causing an increased load on the database. Collectively, everything had added up nicely to swamp the whole system, leaving any queries from the website dog slow and rendering the website very non-responsive. And yes, it all lives together: this was nothing more than an experiment that grew incrementally, so...
All of my hardware is virtually provisioned, and lives within a cloud. I'm biased toward a specific one, but anyway...
As a first step, I figured I should isolate the various parts to see if that helped things along: with so much contention for CPU, I couldn't make a deterministic call on which component was responsible. I figured I'd separate the perl processing from the database/web server first. Fairly simple to do:
Provision a new node
Extraordinarily easy, and in many cases, free if you want a small amount of horsepower. Get an OS booted up on it and call it good.
Addressing
Since there's going to be node to node addressing for the perl programs to talk to the database node, you need a way to maintain address lookups. In my case, I rely on Elastic IPs which, while publicly visible, also provide internal IPs when used within a security group.
Fortunately, I only needed to make one change: point the perl programs to the elastic IP instead of pointing to localhost.
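To make that one change cheap to repeat, the database address can live outside the code. The following is a minimal sketch in Python (the real programs are perl, so this is purely illustrative); the environment variable names `NEWS_DB_HOST` and `NEWS_DB_PORT` and both function names are hypothetical, and 3306 is simply MySQL's default port.

```python
import os


def database_host(environ=None, default="localhost"):
    """Return the database host, falling back to localhost when unset."""
    env = os.environ if environ is None else environ
    # NEWS_DB_HOST is a hypothetical variable name used for illustration.
    host = env.get("NEWS_DB_HOST", "").strip()
    return host or default


def database_settings(environ=None):
    """Collect the connection settings a processing job would need."""
    env = os.environ if environ is None else environ
    return {
        "host": database_host(env),
        # NEWS_DB_PORT is also hypothetical; 3306 is MySQL's default port.
        "port": int(env.get("NEWS_DB_PORT", "3306")),
    }


if __name__ == "__main__":
    # On a single box nothing is set, so everything points at localhost.
    print(database_settings({}))
    # After the split, the processing node is pointed at the database node.
    print(database_settings({"NEWS_DB_HOST": "203.0.113.10"}))
```

With something like this, moving the database again means changing one setting on the processing node rather than editing each program.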
That's it. Asynchronous news acquisition and analysis is on one node, while the database and web server are elsewhere. As evident, separating those two would be trivial too- just get another node, place the war in a web server there, futz with addressing and call it good. If it doesn't work, scrap it- you lost nothing other than the time it took to run your experiment.
There's no rocket science in any of this. But it's heartwarming that in reality it really only takes a couple of hours (to the uninitiated like me) to get this done. Contrast that with trying to do this if you had to work with your own hardware: you'd either have to buy some, or hope there's some lying around, or make a case with your hardware team. Then you'd have to hope that this pans out well, since if it doesn't you just sank your investment in hardware.
This is, admittedly, an almost contrived example of why on-demand virtual provisioning is awesome. But I think I got lucky in that my components were so inherently separable. My initial tendency might have been to do something horrible like have the news acquisition/processing live within the scope of the same war that powered the web-end. One deployment/logs/build to worry about, right?
I've been part of many decisions where I suggested or was persuaded to accept that it was ok to stuff yet another component into an already large ball of yarn. Invariably, all of these would get knit together and thus become one inseparable bundle of pain.
With virtualization being so easy and cheap, I wonder how much easier it might be to spin up fresh instances for every new component you consider? Granted, it's a pendulum swing, and might not always be appropriate. But if you used that premise as a baseline assumption, how would that change the end quality of what you build, how it can scale, and how easy it is to maintain?
23 March 2011