B669: Personalized Data Mining and Mapping


The first four assignments are due weekly on Mondays in class (and on the web). The order you do them in is up to you, as is the group you belong to, as long as you finish the ith assignment by the (i+1)th Monday of the term. If you're having problems finding partners to work with, please send me email.

Assignment 1

Create a directory named with your userid inside /l/knownspace/webpages and inside it create a world-readable webpage named "games.html" describing why computer games are more fun to use than most other software. Give as many specific examples as you can from games and other software you know well. Draw general conclusions from those examples.

You should also gather and present information from books and articles and from web sources (start with HotBot, AltaVista, Yahoo!, Excite, DejaNews, Infoseek, and Lycos). Be sure to include descriptions of any links to useful sites you find. Don't forget to run your webpage through a validator to check for dead links, mismatched tags, and so on. If you haven't used a validator before, try NetMechanic or Website Garage. (Note: you do not need to register to use these services; they are free.)

As with all webpages you will be creating this term, your page will be put in a gallery of all the pages submitted. For this first assignment you must work alone. In this assignment, you are learning about html, css, webpage design, and the web; sharpening your writing skills; and thinking about how to make information management an enjoyable, perhaps even joyous, experience.

Assignment 2

Choose at least four of the 98 topics listed at the top of the Linkage page and create a world-readable webpage in your webpages directory for those topics suitable for later inclusion in the website. You may also choose any other topic not listed there that you can show is relevant. Post your choices to the seminar newsgroup (ac.csci.b669) as soon as possible so that others do not choose the same topics.

Depending on the number of links already in each of your topic areas, either add at least 10 new links to each chosen topic and explain why each link is relevant, or organize and annotate the links that are already there. Organize and describe each of your topics so that someone new to the topic gets a general orientation and description of each link in relation to the other links in that topic and to information management as a whole.

Each of your finished topics should look something like the dhtml entry on the Linkage page. Pay careful attention to the internal formatting of your webpage---it should be clean, easy to edit, and well-organized (lines under 80 characters long, tabs for scoping, and so on)---use the internal formatting of the Linkage page itself as your guide.

If possible, try to record your frustrations with doing directed searches on the web: too many results, too few results, results off-topic, results poorly organized, no spatial arrangement of results, no memory of previous queries for context, varying results from different search engines, broken links, black holes, and so on. All of these issues should be addressed by any information manager.

For this second assignment you may work in groups; any such group, however, should be no larger than four and each group member must still cover at least four topics. In this assignment, you are learning more about the information manager project; learning why it's important (by having to do directed searches for yourself); beginning to work in groups; and functioning as ferrets, filters, and mapmakers---which should help you think about how to do those tasks with a program.

Assignment 3

Write a Java application to read an arbitrary directory structure and display it in some reasonable way on the screen. Your program should attempt to lay out the pages and directories in two or more dimensions and, preferably, should use icons or animations or moveable text or some other means to present the information rather than simple static text labels. Your program should work whether the underlying machine is running Windows 95/98/NT4, Mac OS, Unix, or Solaris, and it should not place arbitrary limits on the number of pages and directories to be displayed. You may assume that the screen is at least 1000x1000 pixels.
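As a starting point only, here is a minimal sketch of the first half of the task: walking a directory tree and assigning each entry a position in two dimensions (depth across, visit order down). The class name DirectoryLayout, the Placed type, and the INDENT and ROW_HEIGHT constants are hypothetical names chosen for illustration; nothing here is prescribed by the assignment, and drawing the result with icons or animation is left out entirely.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirectoryLayout {

    // Hypothetical spacing values, in pixels; tune them for your display.
    public static final int INDENT = 40;
    public static final int ROW_HEIGHT = 20;

    /** One placed entry: a file or directory name plus its position. */
    public static final class Placed {
        public final String name;
        public final boolean isDirectory;
        public final int x;
        public final int y;

        Placed(String name, boolean isDirectory, int x, int y) {
            this.name = name;
            this.isDirectory = isDirectory;
            this.x = x;
            this.y = y;
        }
    }

    /** Walks the tree under root and returns every entry with a position. */
    public static List<Placed> layout(File root) {
        List<Placed> out = new ArrayList<>();
        place(root, 0, out);
        return out;
    }

    // Depth-first walk: x grows with nesting depth, y with visit order.
    // Assumption: the tree has no cycles (for example via symbolic links)
    // and is shallow enough for plain recursion.
    private static void place(File entry, int depth, List<Placed> out) {
        out.add(new Placed(entry.getName(), entry.isDirectory(),
                depth * INDENT, out.size() * ROW_HEIGHT));
        // listFiles() returns null for plain files and unreadable directories.
        File[] children = entry.listFiles();
        if (children == null) {
            return;
        }
        Arrays.sort(children); // stable order so the picture does not jump around
        for (File child : children) {
            place(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (Placed p : layout(root)) {
            System.out.println(p.x + "," + p.y + " " + p.name
                    + (p.isDirectory ? "/" : ""));
        }
    }
}
```

Because the results are collected in a growable list, the sketch imposes no fixed limit on the number of entries. Using only java.io.File keeps it independent of the underlying operating system. An indented list is about the simplest two-dimensional layout there is; treat it as a baseline to improve on, not as a target.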

For this assignment you must work in groups, but they may be as large as you like. Further, you are encouraged to surf the web looking for suitable source code to build on. In this assignment, you are increasing your Java skills; discovering how to program in groups; producing a program that might be the first step toward a decent information manager interface; and grappling with some of the issues involved in designing an interface more sophisticated than today's desktop.

Assignment 4

Build an html parser in Java by building on the supplied code to extract delimited strings, rather than by writing the traditional (but unmaintainable) huge switch statement. The aim here is not just to build a parser but, more importantly, to build one that is well-coded, well-documented, and easy to change as html changes. You must also build a webpage describing the program at various levels of detail: first at the level of the casual reader, then at the level of the interested programmer, and finally at the level of the program maintainer or modifier. Try to foresee all questions any such person may have. Be exhaustive.
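To make the idea of extracting delimited strings concrete, here is a small sketch. It is not the supplied code; the class name DelimitedExtractor and the method extract are hypothetical and written only for illustration. The point is that the delimiters are passed in as data, so handling a new construct means adding a new pair of delimiters rather than another case in a switch.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimitedExtractor {

    /**
     * Returns, in order, every substring of text that lies between an
     * occurrence of open and the next occurrence of close.
     */
    public static List<String> extract(String text, String open, String close) {
        if (open.isEmpty() || close.isEmpty()) {
            // Empty delimiters would match everywhere and never advance.
            throw new IllegalArgumentException("delimiters must be non-empty");
        }
        List<String> found = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(open, from);
            if (start < 0) {
                break; // no more opening delimiters
            }
            int contentStart = start + open.length();
            int end = text.indexOf(close, contentStart);
            if (end < 0) {
                break; // unterminated span: stop rather than guess
            }
            found.add(text.substring(contentStart, end));
            from = end + close.length();
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<!-- note --><p>Hi <b>there</b></p>";
        // Comments and tags use the same routine with different delimiters.
        System.out.println(extract(page, "<!--", "-->"));
        System.out.println(extract(page, "<", ">"));
    }
}
```

This sketch deliberately ignores nesting, quoted attribute values that contain a closing delimiter, and malformed input beyond an unterminated span. A real parser has to decide how to handle each of those, and your documentation should say what you decided and why.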

For this assignment you must work in groups, and they may be as large as you like. Bear in mind, however, that more than three or four people have a hard time meeting at the same time, never mind agreeing on anything. Further, you are encouraged to surf the web looking for code to build on; there are several html parsers available on the net---the problem is that few of them are well-written. In this assignment, you are learning how to produce production-quality Java code and producing a program that will likely be used to parse pages.

Alternatives to Assignment 4

Instead of writing a parser, you may prefer dabbling in AI or playing with ALife. Here is a link to the AI version of assignment 4. Here is a link to the ALife version of assignment 4.

Note: The networks team (plus Dan, if he's so inclined) doesn't have a choice---they have to do the AI version of the assignment and get it to be as advanced as possible. The interface team (minus Baekjun, if he's so inclined) also doesn't have a choice---they have to do the ALife version and perhaps get as much of the scripting done as possible. Everyone else is free to do whatever they want.

Have fun!

Codesmithing Assignment

List all the things that are wrong with the following package. Focus on flexibility (ease of reconfiguring the code), readability (ease of understanding the code), and robustness (resistance to bugs).