
Summer undergraduate research projects in theory

“In theory” as in “in theoretical computer science”.

I am lucky to have a student through the CRA-W Distributed Research Experiences for Undergraduates program. Anna Harutyunyan joins me for 10 weeks from Utah State University. I think it might be more of a learning experience for me than Anna (although my opinion is biased) and I appreciate Anna’s patience through my own growing pains as an advisor. Hopefully there haven’t been too many pains.

Anna is working on a generalization of the string alignment problem. I have an idea for an algorithm, and I have an idea of how one might analyze that algorithm, but it uses tools in which I am not so well versed. In addition to reading up on these tools, Anna has implemented the algorithm. This is not something I am in the habit of doing, but it is very satisfying to see an algorithm “work” when you are stuck on how to analyze it.

That said, my expectations for “proving something” with Anna are low – how does one prove something in 10 weeks? With a new project, I feel that the chance of proving something in such a short amount of time is next to nil. With a project well underway, there is a much better chance, but there is a lot of start-up time involved in learning the state of affairs. So I’m torn as to whether to start a new project with a summer student or include them on parts of an existing project. The former must give the student a stronger sense of ownership over the work; the latter a better chance for the feeling of accomplishment.

Has anyone out there had luck or have advice on picking theory topics for research projects?

While my main goal is for Anna to have a positive experience this summer, at the very least I am having a wonderful time. Anna has had some wonderful ideas that I know would not have dawned on me – it’s exciting! I can’t wait to educate more young minds.

Journals ranked by turnover times: now with colour!

Based on David’s link to the AMS data on journal backlogs in my last post (thanks Dave!) and the ISI Web of Knowledge citation report, I’ve wasted some time making the following fancy graph. There are some obvious missing journals that I didn’t have the data for: Theory of Computing (no impact factor), JACM (no backlog times), etc. If you have this data, I would be happy to add them.

Right now, the plot shows time from submission to acceptance against the impact factor (IF), with journals coloured by publisher and sized by their volume in number of articles. All data is for 2008. It’s interactive! Switch to the 5-year impact factor! Fun!

So, now, I know that impact factors have little meaning in our field. I’d be happy to switch to some other more meaningful ranking. Feel free to comment your suggestions.

But what do you think: would you actually not submit to the SIAM Journal on Discrete Math based on this?

[Interactive motion chart: Journal wait times]

(The above works for me on Safari; I’m not sure how the gadget will work under other browsers. If you can’t see the embedded gadget, try this published spreadsheet.)

Update: I forgot to “give props” to Hans Rosling and GapMinder.org for popularizing these graphs. The graph was created in Google Spreadsheets using the “motion graph” gadget.

Update: JACM added thanks to Dave pointing me to JACM’s self-reported backlog. It is also nicely consistent with the impact factor/wait time correlation. I’d like to comment more on this in a later post: I don’t think this happens in other fields.

Journals ranked by turnover times?

I had a search of the blogs and web at large to see if there was any evidence (anecdotal or otherwise) about the turnover rates for TCS (and friendly) journals. Short answer: I couldn’t find much. I would (and I am sure many other people would) appreciate any help in deciding what journal to submit to if you are particularly in favour of short turnaround times. Of course, I am sure most would also not like to sacrifice quality – at least not too much. Thanks in advance!

Adaptive analysis

Jérémy Barbay was visiting me this week from Universidad de Chile. Although we overlapped at Waterloo by a few months, we had never talked in depth about research before. His visit was great timing to scoop me out of some research doldrums after a stressful winter quarter. He gave a great talk on adaptive analysis.

As we all know, algorithms often out-perform their worst-case analysis. There are a few theoretical tools for explaining this behaviour: think O(n log n) average case v. O(n^2) worst case for quick sort and poly-time smoothed analysis v. exponential-time worst case for the simplex algorithm. In adaptive analysis, we analyze an algorithm in terms of some (problem-dependent) hardness parameter, h. As an introductory example, consider insertion sort. If the input array is sorted, insertion sort takes O(n) time. If the input array is in reverse order, insertion sort takes O(n^2) time. If there are h pairs of elements that are mutually inverted, then insertion sort takes O(n+h) time: the running time depends on how hard a particular input instance is to sort.
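As a rough illustration of the insertion sort example, here is a minimal Python sketch. It is not taken from the talk, and the function names are hypothetical. It counts the element shifts insertion sort performs and compares that count with a brute-force count of inverted pairs, which plays the role of the hardness parameter h.

```python
def insertion_sort_with_shifts(items):
    """Sort a copy of items; return (sorted_list, number_of_element_shifts)."""
    a = list(items)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Each shift moves one larger element right, undoing exactly one inversion.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return a, shifts


def count_inversions(items):
    """Brute-force count of pairs i < j with items[i] > items[j] (this is h)."""
    n = len(items)
    return sum(1 for i in range(n) for j in range(i + 1, n) if items[i] > items[j])


if __name__ == "__main__":
    for data in ([1, 2, 3, 4, 5], [2, 1, 3, 5, 4], [5, 4, 3, 2, 1]):
        _, shifts = insertion_sort_with_shifts(data)
        print(data, "shifts:", shifts, "inversions:", count_inversions(data))
```

The number of shifts equals h exactly, and the outer loop contributes another n - 1 steps, which is where the O(n+h) bound comes from.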

Adaptive analysis has appeared several times in the past, but the word adaptive might not have been used. Jérémy would be a better person to provide a list. The most common examples seem to involve the analysis of existing algorithms. I would be most interested in the lens of adaptivity informing new algorithmic ideas, particularly those that would outperform existing algorithms, at least on interesting or useful inputs. Is there a collection of examples of such results? I know Jérémy mentioned one or two, but I’ve since forgotten in the whirlwind of whiteboard whimsy.

SODA 2012 to be in Kyoto, Japan

I missed the business meeting to have dinner with a non-SODA-attending friend and so missed the vote on the location of SODA 2012, which was apparently a close tie.

I’m a little dismayed at SODA being outside of North America. As a graduate student I would have probably been excited in my responsibility-free state. But now I’m thinking “How much is this going to cost? How can I afford to miss what will probably end up being a full week of teaching? I’m going to go all that way to just go to the conference and not be able to travel? How are our grossly underfunded faculty and grad students going to afford to go? Would I justify going if I don’t have a paper?”

SODA is my favourite conference. And there’s no other conference like it in North America. Going without it for a year would result in some withdrawal.

SODA 20 minute talks

Many people have been blogging on the technical content at SODA, but I won’t. Given that David has already hinted that I only value the first 10 minutes of most talks, clearly I’m not in a position to expound on more than the definition of problems and the highest level of analysis.

I’ve been thinking about what I like about conferences. Of course I appreciate meeting with friends and colleagues – working on new and old problems. I do enjoy the talks too. But for me, the 20-minute talk is problematic. I can only imagine three possible uses of 20 minutes: an advertisement to go read the paper, to educate people on the definition of the problem/topic/solution statement, or to actually go into technical details.

For topics that are directly in my area, 20 minutes is too short to delve into any technical details for which I would have questions. Nor do I need an advertisement. I am probably already aware of the paper (thanks arXiv and its users) and have perhaps already read it.

For topics not in my area, 20 minutes is probably too long for an advertisement and too short for me to absorb definitions in order to appreciate any technical content.

That said, I miss theory seminars. I am the only traditional TCS person at OSU and am too far from theory strongholds to attend a theory seminar. I would love to get that content from a conference. The plenary talks provide a little of that, but they are not usually on recent results of a technical nature (nor would I want that to change).

What I propose is having two types of talks – short 10-15 minute “advertisements” and long 45-60 minute seminar style talks. The committee could choose the best results to give longer slots to. Perhaps (and probably controversially) longer slots could be biased towards better speakers.

Donation price of anarchy

I recently went to a Christmas party where, instead of a gift exchange, there was a donation exchange. Essentially, we each placed a cause’s name into a hat; people then drew names and were asked to donate to the cause they drew. You may donate any amount you wish (including nothing if you are particularly opposed to the cause you drew). Given that this is a group of people that has collectively decided to opt for altruism, the honour system should work. As a result, I will be donating to the World Food Program and someone will be donating to Planned Parenthood on my behalf.

Someone at the party suggested that next year they hold a Yankee swap version where, rather than simply draw and donate, people may later “steal” causes by agreeing to donate more than the current donor. However, I thought this might be unfair to those attending who happen to be unemployed or racking up student debt. I was wondering if there is an algorithmic-game-theory person out there who could come up with a way to deal with this that might meet the following conditions:

the total amount donated is maximized (or at least the price of anarchy is bounded)
each person ends up matched to a cause (that is not their own)
each person can cap their donation according to their means
one’s cap does not hinder the ability to steal a cause
the game doesn’t take forever and the rules are simple enough for a smart crowd to understand

I suppose one could hide everything and have causes bid on like Google AdWords, but I think a game of stealing in the spirit of Christmas would be more fun.

nth Combinatorial Potlatch

The Combinatorial Potlatch is a semi-regular (which for the last 7 years has been yearly!) one-day workshop in combinatorics held in Cascadia. It is very informal (no name tags!), very relaxed (only three talks!) and runs on next to no funding*. The latest installment was this past weekend in Vancouver, BC, held at Simon Fraser University’s downtown campus.

Participants at 2009 Combinatorial Potlatch

I gave a version of my talk on constrained knapsack problems (joint work with Brent Heeringa and Gordon Wilfong). It was a lot of fun! The discrete math crowd was fun and patiently sat through my discussions of applications and algorithms and approximations until I finally got to the meat of the talk. I don’t normally attend discrete math events, but this was a great way to meet people in the area who are graph-minded that I otherwise might not meet. I also hope that all their best undergraduates will be pointed my way for grad school (hint hint hint).

Louis Deaett (University of Victoria) gave a talk on an (orthogonal) generalization of graph colouring to vector colours, where one must assign linearly independent vectors to adjacent vertices while minimizing the dimension of the vectors. This is certainly not something I had ever dreamt of before. Only after having let the problem stew for a couple of days am I wondering if such a notion can be (or already has been) used in the frequency assignment problem. Rather than a node transmitting over one frequency, transmit over several; use independence to overcome interference.

Omer Angel (University of British Columbia) spoke on graphs that look the same everywhere from a local perspective. Given a local pattern centred at a vertex, what kind of graph is such that every vertex has the same local pattern? Can the graph be finite? Must it be infinite? For example, if the local pattern is a degree-2 star, then the graph could be a cycle or an infinite path – there is no way of telling which it is. Certainly, I thought, you could never tell if it is finite or infinite. Not true.

So, thank you Nancy Ann Neudauer for inviting me, Luis Goddyn for arranging the superb location, and Rob Beezer for quickly correcting that I am a proud beaver, not a duck.

* The host institution provides a room and math-fuel (coffee).

Postdoc after postdoc after postdoc?

There’s talk of postdocking* in the air – for one, Jonathan Katz posted about how to better match recent grads to postdoc positions. It looks like this year’s academic-job market is even worse than last and that postdocs might just fill in the gap for a year or two for some people – including those that are currently postdocking. Hearing such things makes me cringe, but not because I think postdocs shouldn’t exist. I am very thankful for my 20 months spent as a postdoc. I don’t think I became a stronger job applicant in that time, but I do think that I became more confident in that time.

In the agonizing months** between interview and job offer at Oregon State University, I gave a lot of thought to “what do I do if I don’t get an academic job?” I had the option of staying on as a postdoc through summer 2010 – an option that made me cringe. “If I stay as a postdoc and next year’s market is terrible and then take another postdoc … where does the cycle end?”

I have many friends in the biosciences, where doing two 3+ year postdocs is the norm. One has started a blog devoted to advocacy for postdocs; a recent post calls for the cycle of postdocking to end. I worry that CS could “get worse” and end up like bio. I hope that the competition offered by industry will help keep the postdocking length down. But Ph.D. enrollment is going up – where are these students supposed to go? Does anyone know if there are stats on the average postdoc length in computer science?

* I officially propose postdocking as the verbal of postdoc much like trafficking to traffic.
** Days became months due to budget hoop-jumping.

Job talks

I recently found out that when I gave my job talk at Oregon State University last year, I was being recorded. I was hesitant to post it, but I hope that, despite this far-from-perfect performance, it might be useful to those on the job market this year. Note that Oregon State is not a theory school. I was talking to an audience of grad students and faculty, none of whom (except one) work in algorithms. If I had been giving a talk at a theory powerhouse, I probably would have targeted it differently.

I broke a lot of standard rules in giving this job talk. First and foremost, I did not practice it. *gasp* Practice would have removed a lot of my “um”s and “uh”s. In my defence, when I practice a talk too much, I find it gets stale. However, practicing it once from start-to-finish would have been a good idea. In watching this talk (as painful as it is), I think the best thing I could have done was to tape myself once.

Second, I climbed on a chair. I was offered a laser pointer, but I hate laser pointers. They are hard to keep steady and the point is very small and hard for the audience to see. I find it about as useful as the speaker pointing to their laptop screen while they give a presentation. So, at some point I wanted to point at something that was too high for me, so I climbed on a chair.

Another minor thing that I wish I would get in the habit of doing is repeating an asked question. Taking two seconds to summarize the question both confirms that you are answering the intended question and allows the entire audience to hear both the question and the answer.

The slides for the talk are available for Keynote and PowerPoint here.

Glencora Borradaile

Professor, School of Electrical Engineering and Computer Science, Oregon State University