Web Content Analysis Syllabus

S642: Content Analysis for the World Wide Web



Fall 2009


Susan Herring


Thursday 5:45-8:30 p.m.


LI 037


LI 030


(812) 856-4919 (voice mail)




herring @ indiana.edu

Instructor's Office Hours: Th 4-5 p.m. and by appointment

Class listserv list: caweb-l @ indiana.edu

Required Readings:

Most of the readings for this course are available on the web (live links are included in this syllabus). The others will be on e-reserves or Oncourse.

1. Course Description

Content Analysis is an established social science methodology for analyzing meaning and structure in written documents; it can also be used to analyze images and sound. The World Wide Web is a multimodal, networked means of document delivery that is the most important source of content in the world today.

In this course, you will learn about and apply Content Analysis methods, both narrowly and broadly construed, to diverse types of content communicated through HTML documents on the web, including text and graphics, video, interactivity features, and links. The methods, which are both qualitative and quantitative, can be used to analyze genre characteristics, aesthetics, usability, "stickiness," credibility, persuasion, bias, and cultural differences associated with the presentation of information on the web. In addition, we will consider how Content Analysis can be adapted to analyze "Web 2.0" content, such as content collaboratively produced on wikis, social network sites, microblogging sites, and social bookmarking sites.

The course is structured around presentation of methods and hands-on web data analysis. Each student selects a website or sites for analysis, according to their interests. For example, students with interests in a particular content domain (e-commerce, online instruction, news, politics, health information, gender issues, etc.) or web genres (blogs, wikis, social network sites, online dating sites, music downloading sites, social bookmarking sites, etc.) may focus on them in their choice of data for analysis. After each method is presented in class through the readings and lectures, students apply it to their data. The students' findings are then shared with the class through oral presentations, and written up in short reports. At the end of the semester, students write an original research paper describing a web genre or other collection of sites of their choice. As relatively little research of this type has been carried out so far, it is likely that each student project will create new knowledge about the web. If it is well done, your research in this course can lead to opportunities for conference presentation and/or publication.

Students are expected to have experience accessing the World Wide Web, including using search engines such as Google. No previous knowledge of Content Analysis is required. Students do not create websites as part of this course; rather, the focus is on creating knowledge about the web through descriptive empirical research. This knowledge, in turn, may have implications for web design and/or content development that extend beyond the course.

2.   Course Objectives

o   To provide training in applying a set of empirical analytical methods to web content.

o   To instill understanding of Content Analysis principles that will enable you to design and carry out CA research, and ultimately to modify the methods to address questions and data of interest to you.

Specifically, as a result of completing this course, you should gain:

o   A critical perspective (in the positive sense) on the web as a communication medium.

o   Practical skills in applying and interpreting the results of Content Analysis methods.

o   The ability to design and carry out an original research project.

3.   Student Requirements

Readings: Students are expected to read the assigned readings before each scheduled class meeting.

Website analysis. Each student will select a website (or sites) for the purpose of analysis throughout the course. The sites should contain content that the student finds personally interesting and/or that relates to their professional goals. These data will be used to train the student in applying Content Analysis methods. They may also be used, supplemented with additional data, for the final research paper.

Reports. The results of applying the methods introduced in the course to the selected data will be presented in four oral and four written reports, where the written reports are on the same topics as the oral reports. The oral reports should be brief (6 minutes) and may be supported with simple PowerPoint displays and live internet demonstrations. (A good rule of thumb is one PowerPoint slide per minute of presentation time.) The written reports should record the findings presented in the oral reports, incorporating feedback from the class and the instructor, concisely and clearly (3-4 pages, excluding appendices). Guidelines for each report will be made available one week before the scheduled oral report presentation date.

Research paper. At the end of the semester, each student will write a 4500-7500 word research paper (excluding references and appendices) analyzing the content of a collection of websites defined by the student. This research may make use of the data already analyzed during the semester, or it may supplement or replace those data with new data (with the instructor's approval). However, it should NOT just be a compilation of the written reports. A 500-word written proposal describing the web genre, sites to be analyzed, and methods to be employed, and including a minimum of 3-5 references, is due in the 11th week of the semester. In the last week of the semester, the results of each student's research will be presented to the class in a formal (conference-style) oral presentation (approx. 20 minutes, depending on how many students are enrolled in the course). The written paper should follow the formal conventions for a publishable-quality research article, including footnotes and citations of scholarly work in APA (American Psychological Association) style. (See course bibliography for examples of APA reference style.)

Listserv list. There is a listserv list for this course. Students are expected to check their email at least twice between class meetings, including the afternoon before class for last-minute announcements and reminders.

4.   Grading

Your grade for the course will be calculated as follows:

Oral reports (4x4%)


Written reports (4x6%)


Oral presentation of term paper research


Term paper





Grading policy:

o     A late written report will be accepted once during the semester, no questions asked, provided it is turned in two days before the next class meeting, to allow me time to grade it. I reserve the right to subtract one-third of a letter grade (from A to A-, A- to B+, etc.) for each day a report is late beyond the due date or this one-time extension. This penalty also applies to the final paper.

o    Class participation means speaking in class in an informed way about the topics under discussion. A good rule of thumb is to try to speak at least twice in each class session. In order to be able to speak intelligently about a topic, you will need to have done the readings for that topic before class. You will also need to be physically present and attentive (e.g., NOT surfing the Web or reading email). Participation cannot be made up if you miss a class.

o    Oral reports will be graded with a check mark to indicate a satisfactory presentation. A satisfactory presentation is one that makes a good faith effort to address all the questions in the guidelines given in advance for each report, even if the report contains some errors. This method of grading is intended to encourage you to try to apply the methods, even if you feel somewhat uncertain how to do so.

o    Written reports, the oral presentation of your term paper research, and the written term paper will be assigned letter grades (A, A-, B+, B, B-, C+, C, etc.). A composite grade such as A-/B+ means that the grade is between an A- and a B+ (i.e., 89.5%). Grades in the 'A' range indicate outstanding work. Grades in the 'B' range indicate very good to good work. Grades in the 'C' range indicate average work, and a grade of 'D' or below is poor work.  Graduate students are expected to perform at a 'B' level or above.

o    Written reports should be concise (3-4 typed pages) and written in continuous prose (NOT outline style). Elaborate introductory and concluding paragraphs are unnecessary, but each report should begin with a statement of the topic that the report will address and should be sure to answer explicitly all questions asked in the guidelines for the report. DO include examples from your data and/or summary tables and graphs of your analytical results in your report, to support your claims. If including these supporting materials in the report would disrupt its flow, they may be appended to the report as an appendix. An 'A' quality written report is written clearly and concisely, answers all the questions asked, applies the methods correctly, and interprets the results plausibly and convincingly.

o     The oral presentation of the final research project will be graded primarily on form: how well it is organized, how informative it is, and how clearly and professionally it communicates to the audience (i.e., the rest of the class). An 'A' quality oral report conveys an appropriate amount of information given the time allotted for presentation, is presented in a straightforward and concise manner, and is logically organized (following the schema: identification and motivation of the choice of web data, brief background on the genre, data sampling, methods of analysis, findings, and some interpretation of the findings). Visual displays are strongly encouraged.

o     The final paper will be graded on content--motivation of the choice of web data, appropriateness of the data selection procedures, accuracy of the description and application of the methods, plausibility of the interpretations--and form--organization, clarity and quality of written expression, and appropriate use of scholarly conventions such as citations and footnotes. An 'A' quality term paper motivates the research topic, makes appropriate use of sampling and analytical techniques, and interprets the findings thoughtfully, in addition to being well-organized and clearly and professionally written. Some visual representations of the content of the analyzed websites (e.g., screenshots) should be included.

Academic honesty:  Most of your activity in this course will involve producing original research. However, in writing about your research, and especially in your final paper, it may be necessary to reference previous work. As a rule of thumb, when in doubt, cite the source! In accordance with the policies of Indiana University, plagiarism, copyright infringement, and other types of academic dishonesty will not be tolerated.

5.   Course Schedule (subject to revision with advance warning)


Week 1 (9/3):       Introduction to Content Analysis. Selecting websites to analyze for this course.

Read:  Bauer, M. (2000). Classical content analysis: A review. In M. Bauer & G. Gaskell (eds.), Qualitative Researching with Text, Image and Sound (pp. 131-151). Thousand Oaks, CA: Sage. (on e-reserves)

Useful background reading on the history and technical aspects of the Web: Berners-Lee, T. (1996). The World Wide Web: Past, present and future. http://www.w3.org/People/Berners-Lee/1996/ppf.html


Week 2 (9/10):         Web archives. Methodological issues in analyzing the web.

            In class: Select and describe a website of the type you would like to analyze in this course.

Read:   Lyman, P., & Kahle, B. (1998). Archiving digital artifacts: Organizing an agenda for action. D-Lib Magazine. http://www.dlib.org/dlib/july98/07lyman.html

Schneider, S. M., & Foot, K. A. (2004). The web as an object of study. New Media & Society, 6 (1), 114-122. http://faculty.washington.edu/kfoot/Publications/Web-as-Object-of-Study.pdf

McMillan, S. J. (2000). The microscope and the moving target: The challenge of applying content analysis to the World Wide Web. Journalism and Mass Communication Quarterly, 77(1), 80-98. http://web.utk.edu/~sjmcmill/Research/research.htm

Herring, S. C. (In press). Web content analysis: Expanding the paradigm. In J. Hunsinger, M. Allen, & L. Klastrup (Eds.), The International Handbook of Internet Research. Springer Verlag. [Oncourse]


            Hands-on: Check out the history of 2 webpages on the Wayback Machine: http://archive.org. How has web design evolved in the past decade?
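For the hands-on exercise, one way to jump to archived copies from particular years is to build Wayback Machine addresses directly. The sketch below assumes the commonly used address pattern http://web.archive.org/web/<timestamp>/<original URL>, in which the archive redirects to the capture closest to the requested date. The function names and the example address are illustrative only and are not part of the course materials.

```python
def wayback_url(page_url, year, month=1, day=1):
    """Build a Wayback Machine address for the capture nearest a date.

    Assumes the pattern web.archive.org/web/<YYYYMMDD>/<url>; the
    archive is expected to redirect to the closest capture it holds.
    """
    timestamp = f"{year:04d}{month:02d}{day:02d}"
    return f"http://web.archive.org/web/{timestamp}/{page_url}"


def snapshots_over_time(page_url, start_year, end_year, step=3):
    """Return (year, address) pairs spaced `step` years apart."""
    return [
        (year, wayback_url(page_url, year))
        for year in range(start_year, end_year + 1, step)
    ]


if __name__ == "__main__":
    # Placeholder address; substitute the pages you are studying.
    for year, address in snapshots_over_time("http://www.example.com/", 1999, 2009):
        print(year, address)
```

Opening the printed addresses side by side gives a quick view of how a page's design changed over the decade.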


Week 3 (9/17):       Web genres and feature analysis.

            In preparation for the 1st report: Select 5-6 websites of the same genre

Read:  Crowston, K., & Williams, M. (2000). Reproduced and emergent genres of communication on the World-Wide Web. The Information Society, 16(3), 201-216. [Oncourse]

Bates, M. J., & Lu, S. (1997). An exploratory profile of personal home pages: Content, design, metaphors. Online and CDROM Review, 21(6), 331-340. [ereserves]

Herring, S. C., Scheidt, L. A., Bonus, S., & Wright, E. (2004). Bridging the gap: A genre analysis of weblogs. Proceedings of the 37th Hawai'i International Conference on System Sciences (HICSS-37). Los Alamitos: IEEE Computer Society Press. [Oncourse]


Week 4 (9/24):       Feature analysis (cont.). Interactivity and website credibility.

            1st oral report: Identify and analyze the frequency of the features in your 5-6 site sample that characterize that genre of web content

Read:   Ha, L., & James, E. L. (1998). Interactivity reexamined: A baseline analysis of early business web sites. Journal of Broadcasting and Electronic Media, 42(4), 457-474. [ereserves]

Chou, C. (2003). Interactivity and interactive functions in web-based learning systems: A technical framework for designers. British Journal of Educational Technology, 34(3), 265-279. [Oncourse]

Rains, S. A., & Karmikel, C. D. (2008). Health information-seeking and perceptions of website credibility: Examining Web-use orientation, message characteristics, and structural features of websites. Computers in Human Behavior, 25, 544-553. [Oncourse]

Look over: http://credibility.stanford.edu/guidelines/index.html
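The feature frequency count for the first report can be tallied by hand or in a spreadsheet. As a minimal sketch, assuming each site in the sample has already been coded for the presence or absence of each feature, the following Python counts how many sites show each feature. The site names and feature labels are invented placeholders, not data from any of the readings.

```python
from collections import Counter

# Hypothetical coding sheet: site -> set of features observed on it.
coded_sites = {
    "site_a": {"search box", "comments", "archive links"},
    "site_b": {"comments", "archive links"},
    "site_c": {"search box", "comments"},
    "site_d": {"comments", "badges"},
    "site_e": {"comments", "archive links", "badges"},
}


def feature_frequencies(sites):
    """Return {feature: (count, percent of sites)}, most frequent first."""
    counts = Counter(
        feature for features in sites.values() for feature in features
    )
    total = len(sites)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        feature: (n, round(100 * n / total, 1)) for feature, n in ordered
    }


if __name__ == "__main__":
    for feature, (n, pct) in feature_frequencies(coded_sites).items():
        print(f"{feature:15s} {n}/{len(coded_sites)} sites ({pct}%)")
```

Features that appear on most or all sites in the sample are natural candidates for characteristics of the genre; rare features may point to variation within it.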


Week 5 (10/1):       Image analysis.

1st written report due: Feature analysis

Read:  Bell, P. (2001). Content analysis of visual images. In T. van Leeuwen & C. Jewitt (eds.), Handbook of Visual Analysis (pp. 10-34). London: Sage. [Oncourse]

Schwalbe, C. B. (2006). Remembering our shared past: Visually framing the Iraq war on U.S. news websites. Journal of Computer-Mediated Communication, 12 (1), article 14. http://jcmc.indiana.edu/vol12/issue1/schwalbe.html

van Leeuwen, T. (2001). Semiotics and iconography. In T. van Leeuwen & C. Jewitt (eds.), Handbook of Visual Analysis (pp. 92-118). London: Sage. [Oncourse]

Schmid-Isler, S. (2000). The language of digital genres. A semiotic investigation of style and iconology on the World Wide Web. Proceedings of the 33rd Hawaii International Conference on System Sciences. Los Alamitos: IEEE Press. [Oncourse] [short]


Week 6 (10/8):   Image analysis (cont.). Cultural differences. NO CLASS MEETING (Prof. Herring attending AoIR conference in Milwaukee).

 Discuss readings on Oncourse

Read:  Barber, W., & Badre, A. (1998). Culturability: The merging of culture and usability. Proceedings of the 4th Conference on Human Factors and the Web, June. http://research.microsoft.com/en-us/um/people/marycz/hfweb98/barber/

Callahan, E. (2005). Cultural similarities and differences in the design of university websites. Journal of Computer-Mediated Communication, 11(1), article 12. http://jcmc.indiana.edu/vol11/issue1/callahan.html

Würtz, E. (2005). A cross-cultural analysis of websites from high-context cultures and low-context cultures. Journal of Computer-Mediated Communication, 11(1), article 13. http://jcmc.indiana.edu/vol11/issue1/wuertz.html


Week 7 (10/15):       Video analysis.

2nd oral report: Visual CA and semiotic/iconographic analysis of images on five sites from one genre

Read:  Freeman, B., & Chapman, S. (2007). Is YouTube telling or selling you something? Tobacco content on the YouTube video-sharing website. Tobacco Control, 16, 207-210. [Oncourse]

Hesse-Biber, S., Dupuis, P. R., & Kinder, T. S. (1997). New developments in video ethnography and visual sociology: Analyzing multimedia data qualitatively. Social Science Computer Review, 15(1), 5-12. [Oncourse]

Evans, W. (2000). Teaching computers to watch television: Content-based image retrieval for content analysis. Social Science Computer Review, 18(3), 246-257. [Oncourse]

Week 8 (10/22):     Theme analysis.

  2nd written report due: Image analysis

Read:  Rosson, M. (1999). I get by with a little help from my cyber-friends: Sharing stories of good and bad times on the Web. Journal of Computer-Mediated Communication, 4(4). http://jcmc.indiana.edu/vol4/issue4/rosson.html

Shuyler, K. S., & Knight, K. M. (2003). What are patients seeking when they turn to the Internet? Qualitative content analysis of questions asked by visitors to an orthopaedics Web site. Journal of Medical Internet Research, 5(4), e24. [Oncourse]

Dimitrova, D. V., Kaid, L. L., Williams, A., & Trammell, K. D. (2005). War on the Web: The immediate news framing of Gulf War II. The Harvard International Journal of Press/Politics, 10(1), 22-44. [Oncourse]


Week 9 (10/29):     Language analysis (computerized text analysis).

Read:  Lowe, W. (2002). Software for content analysis - A review. [Oncourse]

Cohn, M. A., Mehl, M. R., & Pennebaker, J. W. (2004). Linguistic indicators of psychological change after September 11, 2001. Psychological Science, 15, 687-693. http://dingo.sbs.arizona.edu/~mehl/eReprints/Sept%2011%20Livejournal.pdf

Huffaker, D. A., & Calvert, S. L. (2005). Gender, identity, and language use in teenage blogs. Journal of Computer-Mediated Communication, 10(2), article 1. http://jcmc.indiana.edu/vol10/issue2/huffaker.html


Week 10 (11/5):     Link analysis.

             3rd oral report: Theme analysis

Read:  Foot, K. A., Schneider, S. M., Dougherty, M., Xenos, M., & Larsen, E. (2003). Analyzing linking practices: Candidate sites in the 2002 U.S. electoral web sphere. Journal of Computer-Mediated Communication, 8(4). http://jcmc.indiana.edu/vol8/issue4/foot.html

Park, H. W., & Thelwall, M. (2003). Hyperlink analyses of the World Wide Web: A review. Journal of Computer-Mediated Communication, 8(4). http://jcmc.indiana.edu/vol8/issue4/park.html


Week 11 (11/12):      Link analysis (cont.). Social network analysis.

            3rd written report due: Theme analysis

Read:  Adamic, L. A., Buyukkokten, O., & Adar, E. (2003). A social network caught in the web. First Monday, 8(6). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1057/977

Herring, S. C., Kouper, I., Paolillo, J. C., Scheidt, L. A., Tyworth, M., Welsch, P., Wright, E., & Yu, N. (2005). Conversations in the blogosphere: An analysis "from the bottom up." Proceedings of the Thirty-Eighth Hawai'i International Conference on System Sciences. Los Alamitos: IEEE Press. [Oncourse]

            500-word description of final research project due (see under Student Requirements at beginning of syllabus)


Week 12 (11/19):     The challenges of Web 2.0: Wikis.

            4th oral report: Link analysis

Read:  O'Reilly, T. (2005). What is Web 2.0? Design patterns and business models for the next generation of software. http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

Kittur, A., Chi, E. H., Pendleton, B. A., Suh, B., & Mytkowicz, T. (2007). Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie. 25th Annual ACM Conference on Human Factors in Computing Systems (CHI 2007); April 28-May 3, San Jose, CA. [Oncourse]

Pfeil, U., Zaphiris, P., & Ang, C. S. (2006). Cultural differences in collaborative authoring of Wikipedia. Journal of Computer-Mediated Communication, 12(1), article 5. http://jcmc.indiana.edu/vol12/issue1/pfeil.html




Week 13 (12/3):   The challenges of Web 2.0 (cont.): Social bookmarking sites.

            4th written report due: Link analysis

Read:  Hammond, T., Hannay, T., Lund, B., & Scott, J. (2005). Social bookmarking tools (I): A general review. D-Lib Magazine, 11(4). http://www.dlib.org/dlib/april05/hammond/04hammond.html

Golder, S., & Huberman, B. (2006). The structure of collaborative tagging systems. Journal of Information Science, 32 (2), 198-208. [Oncourse]

Yew, J., Gibson, F., & Teasley, S. (2006). Learning by tagging: The role of social tagging in group knowledge formation. MERLOT Journal of Online Learning and Teaching, 2(4), December. http://jolt.merlot.org/vol2no4/yew.htm


Week 14 (12/10):   Oral presentations of term paper research.

Week 15 (12/15):   Written term paper due by 6 p.m., TUESDAY, December 15th.


Last updated September 2, 2009