ECS163 Homework and Projects


Misc. Notes

  • All submissions should be done via SmartSite.
  • Homework and projects are to be done solo, unless otherwise noted.
  • When formatting Word documents or PDFs for your write-ups, use standard formatting (e.g., fonts like Times New Roman, Arial, or Cambria, and 11 or 12 point sizing). You are allowed to double space if you want and are encouraged to stylize your document with features like figures, captions, and references, but don't use this as an excuse to skimp on the written content. Write-ups are a significant part of each assignment's grade; please ensure that your report's appearance, formatting, and writing are well done.
  • Up to 20% extra credit is possible for each assignment. Points are awarded for work that is particularly well done or offers interesting insight/analysis into the data.
  • If you have questions about an assignment, contact Chris (the TA) via email or attend office hours.

A Note on Datasets

Suggested datasets for Homework #2, Project #1, and Project #2 are taken from The Opportunity Project, which aggregates several government data sources from state and federal websites. Datasets vary in size and complexity, and may require clean-up before use (real data is often messy!). Be sure to factor these considerations into your work to make sure you have enough time to complete assignments.

You may elect to use your own dataset for these assignments; however, it must be government-based: either one sourced from The Opportunity Project website or one from a different government data repository, such as the U.S. Census Bureau or Data.gov sites. If you want to use your own dataset for a project, get prior approval from Chris.

You are not allowed to use the same dataset more than once!

Homework 1: Find and present a visualization you like (5%)

Duration: 7 days
Assigned: January 10
Due: 11:55pm January 16 (electronic submission)
Presentation: January 26 in class

Description

For homework #1, your task is to find a visualization (or visual-based interface) that is "in the wild" somewhere out there on the internet (i.e., use a website instead of a textbook or technical paper). Find what you believe is a good or interesting example of visualization. Of course, since it's early in the quarter, and we haven't completely defined what makes a visualization “good,” we don’t expect you to find a flawless example, but do your best!

You will then create an HTML web page with the following components:

  • Header containing a title and your name.
  • Screenshot image of the visualization or web page.
  • Hyperlink to the source page that you got the visualization from.
  • A short, text description of the visualization.

Your text description should be 1-2 paragraphs, and answer the following:

  • What is being shown?
  • Is the visualization interactive or static? If interactive, how can it be interacted with?
  • What are some of the design choices and visualization techniques used?
    • These include things like choice of color or color palettes, the type of plot/chart used, the interaction techniques, etc.
  • What, in your opinion, is the rationale behind the design choices?
  • Do you think this is a good visualization?
    • Give a short, ~1 paragraph critique on why or why not.

Your HTML page does not have to be elaborately styled, but please make it clean-looking and succinct. An example web page is on SmartSite, but feel free to use your own design or CSS styling. Format and submit the following files via SmartSite:

  • The web page file, named index.html.
  • The visualization screenshot image, named cover.png.
  • Any CSS (or other) files used to build your page.

Some of you will be asked to present your selected visualization in class on 1/26 using your web page. Your presentation should be no longer than 1.5 minutes; be prepared to answer 1-2 questions about your selected visualization afterwards.


Homework 2: Use Tableau to Create a Visualization (10%)

Duration: 9 days
Assigned: January 17
Due: January 25 11:55pm (electronic submission)
Presentation: February 2 in class

Description

In this assignment, you will use a commercial visualization application (Tableau) to explore how datasets can be visualized, and then use it to create your own visualization of a dataset. The advantage of using an existing application is that you don't have to write code to create your charts or plots; this is easily done within the application itself.

The first step in this assignment is to download and install Tableau Public (link) on your computer. (If you have trouble with this step, either use the online version or talk to Chris about it.) Select one of the datasets below (or get approval from Chris to use your own) and explore it with Tableau. Play around with visualizing the data in different ways; you want to try to find a good way to either summarize the whole thing or focus on presenting a specific trend, pattern, feature, etc. within the data.

Create a single, static visualization (or chart, or infographic) that you will submit. You will turn in a Word document or PDF file, approximately 1 page in length. Take a screenshot of your visualization and put it at the top of the doc, then write a couple of paragraphs describing the following:

  • Which dataset you are showing (including where you got it from).
  • How you are showing the data (i.e., what are your design choices?).
    • What visual technique are you using?
    • What are the marks and channels used?
    • What does color represent?
  • A brief rationale for why you are showing it the way you are.
    • Justify your design choices. For example, why you chose the visual technique that you did, etc.
  • Finally, state if you feel your visualization is successful at showing the data. If not, how could it be improved?

Grading Criteria (100 pts total)

  • 50 pts: The created visualization is well done and makes sense.
  • 50 pts: The write-up is well done and addresses the above questions.

Suggested Datasets


Some of you will be invited to present your visualizations in class on 2/2.

Project 1: Visual Data Storytelling (15%)

Duration: 14 days
Assigned: January 26
Milestone: February 1 11:55pm (electronic submission)
Due: February 8 11:55pm (electronic submission)

Description

Visualization lets people explore data for the purposes of discovery and analysis. In this first project, you will pose questions about a dataset and create static visualizations that answer them. You will log this exploratory process as a "data story."

The goal in this project is not to necessarily develop new or novel ways to plot data, but to get practice in using visual techniques to analyze, explore, and explain it. You will document this question → visualization → answer process by creating a single-page website. On it, you will note each question that you ask, show the visualization created to answer each question, and describe how the created visualization led you to the ensuing analysis that was conducted (that is, the next step performed, question asked, or conclusion).

One way to start this process is as follows:

  1. Pick a dataset that interests you.
  2. Pose an initial, overarching question to answer.
    • For example: Is there a relationship between attribute A and B in the dataset? What are the tendencies or trends of attribute C? Are data values of attribute D above a certain threshold correlated to some other type of data attribute? Is there a pattern to the data, if so, what causes it?
  3. Assess the fitness of your dataset.
    • In its initial form, the raw data might be appropriate to start answering your question, but it might not be! You might need to aggregate values together to sum up the raw data points, or apply some statistical technique to them, such as normalization or clustering. Maybe the data should be cleaned or reformatted (e.g., raw values converted from strings to decimals) for easier parsing and display. Do these necessary steps here, producing a new text file (CSV, JSON, whatever format works for you) that arranges the data as you need it.
  4. Using this pruned data file, construct a D3 visualization that provides an answer to this initial question. Your visualization can be static (interaction is not required), but you are required to render it using D3 as part of your web page.
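
As an illustration of step 3, here is a minimal sketch of cleaning and aggregating a raw file before charting it. It uses Python's standard library only as one possible preprocessing tool (the assignment does not prescribe one), and the column names "group" and "value", along with the sample rows, are made-up placeholders rather than fields from any suggested dataset.

```python
import csv
import io
import json
from collections import defaultdict

# Made-up raw data for illustration; real files would be read from disk.
RAW_CSV = """group,value
A,"1,200"
A,800
B,N/A
B,500
"""

def parse_number(text):
    """Convert a string such as "1,200" to a float; return None if it is not numeric."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None

def aggregate_mean(rows, key, value):
    """Group rows by `key` and average the parsed `value` column, skipping bad cells."""
    groups = defaultdict(list)
    for row in rows:
        number = parse_number(row[value])
        if number is not None:
            groups[row[key]].append(number)
    return [
        {key: name, "mean_" + value: sum(vals) / len(vals)}
        for name, vals in sorted(groups.items())
    ]

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
summary = aggregate_mean(rows, "group", "value")

# JSON text that could be saved as a small data file for a chart to load.
summary_json = json.dumps(summary, indent=2)
```

The resulting summary has one record per group, which is usually far easier to bind to a chart than the raw rows.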

Upon creating your first chart, you may find your initial assumption changes or a new question emerges. Maybe the data shows a different behavior than expected? Or maybe you can now ask a new, more specific question that forces you to investigate a subset of the data. Perhaps the initial view shows data point outliers, or perhaps one dimension is obviously very important and should be further investigated.

Based on this new question, repeat steps 3 and 4 to create a second visualization that further analyzes and explores the data. For this second view, you should use a different visualization technique (e.g., if you first used a scatter plot, you cannot use one again). You might need to re-format or parse your data (creating a new data file) for this second view. This is an organic exploration process; you are trying to discover aspects, patterns, outliers, and/or trends of the data using visualization, so use your judgment and follow what interests you!

Iterate through this question-to-visualization creation process at least three times, asking 3-4 questions (and thus coming up with 3-4 visualizations). With this set of questions and created charts, come up with what you feel is a good data story that explains your analysis of the dataset. You will create a single web page to present your results. Organize it based on the sequence of questions that you ask, along with the associated D3 charts. Don’t use a screenshot of your created chart; it should actually be rendered as part of the page. For each visualization, include the following:

  • The question asked.
  • The created D3 chart.
  • A caption describing what the chart shows.
  • A short explanation of your analysis process for this step in the data story, such as your rationale for asking this question, why you chose this chart, if it gave you any insight into the dataset, and why you were prompted to ask your ensuing question (about 1-2 paragraphs).

Include a short introduction about your data story before the first visualization and a summary or conclusion after the last chart (be sure to note where your data set comes from here). At the top of the page, include a title. Remember that you are telling a narrative about your discovery process; you want to inform and explain your thought process, and hopefully tell something interesting about the dataset.

Milestone

The milestone ensures that you are progressing sufficiently on the project. For the milestone, you will make a SmartSite submission of your current project, along with a short readme.txt file describing what has been completed so far. Of course, your project won't be finished yet, but at a minimum, you should have the following completed:

  • Selected a dataset and posed an initial question about the data.
  • Created your first D3 chart for your data story web page.

Zip the necessary files together (your index.html page; the necessary JavaScript, CSS, etc. files; and the formatted data files for your first D3 chart) and submit them on SmartSite. The milestone is worth 10% of the project.

Completed Project

Your completed data story will be a single web page (HTML file) with the D3 charts and accompanying text. Submit all the required files to render the web page via SmartSite. These will include:

  • The web page's main HTML file. Name it index.html.
  • Any required javascript, CSS, etc., files.
  • The data files required for the D3 charts.

As noted earlier, especially if your dataset is large, you will probably need to render aggregated or formatted views of the data. Large files render slowly via JavaScript and D3, so your page may crash if your data files are too large. You may also have trouble uploading especially large files to SmartSite. If you break down a large dataset into smaller, more manageable files, submit only these with your project; there is no need to submit the original, large data files.
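
One way to break a large dataset into smaller files is to split it on a column that your charts filter by. The sketch below assumes a hypothetical "year" column and made-up rows, and it builds the per-year CSV text in memory; writing each entry to disk is left out.

```python
import csv
import io
from collections import defaultdict

def split_by_column(rows, column):
    """Partition a list of row dicts into {column value: [rows]}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)

def to_csv_text(rows):
    """Serialize row dicts to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up rows standing in for a large raw file.
rows = [
    {"year": "2014", "group": "A", "value": "3"},
    {"year": "2014", "group": "B", "value": "5"},
    {"year": "2015", "group": "A", "value": "4"},
]

chunks = split_by_column(rows, "year")
# Hypothetical file names; each entry could be written out and loaded on demand.
files = {f"data_{year}.csv": to_csv_text(part) for year, part in chunks.items()}
```

A page can then load only the file for the year being viewed instead of the full dataset.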

Grading Criteria (100 pts total)

  • 10 pts: Milestone is completed.
  • 30 pts: Exploratory Process
    • The questions asked are applicable to the chosen dataset.
    • Analysis performed on the data should have sufficient depth, not be trivial.
    • Follow-up questions drill deeper down into exploration of the data.
  • 40 pts: Created Visualizations
    • At least 3 different visualization techniques are used.
    • The design choices for the visualizations make sense.
      • Techniques used are appropriate for the data, color choices make sense, etc.
  • 20 pts: Web Page Design
    • The data story is contained in a single page.
    • The page contains the required necessary D3 charts, story text, and header/title.
    • Page design and styling is well done.

Extra credit consideration will be given for projects that answer additional questions, include interactivity, or perform interesting analysis into the data.

Suggested Datasets

Remember you can find and use your own dataset, but you must clear your choice with Chris beforehand. Your dataset must also come from the Opportunity Project or affiliated government website (such as data.gov, or census.gov, or the California open data portal).

Project 2: Interactive Visualization Interface (20%)

Duration: 19 days
Assigned: February 9
Milestone: 11:55pm February 17 (electronic submission)
Due: 11:55pm February 27 (electronic submission)

Description

In Project 1, you created a "visual data story" by implementing static charts that answered a set of questions for a dataset. This project takes the next step, which is to create an interactive, visual analytics-based interface that allows user-guided exploration and analysis.

Your interface will be D3-based and include both overview and detail views of the data, allowing for analysis at different levels of granularity. You are required to use at least 3 visualization techniques to display the data, include at least 1 type of level-of-detail interaction (e.g., zooming, highlighting, selecting a specific data point, etc.), and implement some type of filtering or group selection that can be applied to at least 1 of the views. In addition, your system's visualizations must be connected; that is, there should be a directed flow to the way you move between the different views. They shouldn't be independent of each other or standalone.

What visualization techniques you use to show the data and how you interact within the system is up to you, so think about what will be effective for your chosen dataset. For example, you might start off by showing a high-level, aggregated overview of the data. Selecting part of it will load a specific, detailed view, based on your selection. From here, you can visualize your detail view in different ways. Alternatively, you could apply a filter to this data selection to explore certain subsets of the data. Remember to apply principles from lectures and readings! Trying different techniques may lead to finding ones that work better!

In addition to creating the interface, this project contains an evaluation step. When your system is finished, coordinate with another student in the class. Demo your application for that student and let them play around with it. That student will give feedback on things like what they like about the system, what could be improved, and ways the system could be expanded in the future. You will include their qualitative commentary and critique in your final project write-up. (Note that you do not have to implement any of their recommendations or address their critiques; this is instead a way for you to practice getting feedback and to consider alternative or improved ways to implement future projects.)

Similar to Project 1, you may find it necessary to break your chosen dataset up into pieces, or aggregate it together into high-level views, especially if the chosen dataset is large! This will let your interface load faster and allow you to include more interaction with it, since you'll only have to reference the specific parts/granularities that you are currently viewing or interacting with.

Tasks and Grading Breakdown (100 pts)

The completed project will be an interactive system along with a PDF or Word document report. Your interactive system should do the following specific tasks:

  • 15 pts: Overview Visualization
    • Display an overview of the dataset that visualizes one or more major data dimensions.
  • 15 pts: Detailed View
    • Allow users to see a more detailed visualization of the data via interaction, such as filtering or highlighting the main overview, drilling down, or navigating to a different, connected view. This should show a subset, highlight, or dimensionally orthogonal view of the data, when compared to the main view, and must use a different visualization technique.
  • 15 pts: Third Visualization
    • A third visualization technique should be included to add additional context and detail. This view can either be linked to one of the above two visualizations, or be independent of them and accessed via user interaction, selection, filtering, etc.
  • 10 pts: Interactive Filtering
    • Be able to interactively filter one of the data views based on some relevant criterion (e.g., geographic selection, time range, data value category or threshold, etc.). This can either be part of your initial overview→detail view action, or an additional filtering step elsewhere in the system.
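
The filtering criteria above (time range, value threshold, category) can be thought of as one predicate applied to each record before a view is redrawn. The following is a language-neutral logic sketch written in Python, not D3 code; the field names "year", "category", and "value" and the sample records are hypothetical.

```python
def make_filter(year_range=None, min_value=None, categories=None):
    """Build a predicate from optional criteria; a criterion left as None is ignored."""
    def keep(record):
        if year_range is not None and not (year_range[0] <= record["year"] <= year_range[1]):
            return False
        if min_value is not None and record["value"] < min_value:
            return False
        if categories is not None and record["category"] not in categories:
            return False
        return True
    return keep

# Made-up records for illustration.
records = [
    {"year": 2014, "category": "A", "value": 3.0},
    {"year": 2015, "category": "B", "value": 7.5},
    {"year": 2016, "category": "A", "value": 9.0},
]

# For example, a time-range slider combined with a value threshold.
keep = make_filter(year_range=(2015, 2016), min_value=5.0)
visible = [r for r in records if keep(r)]
```

Keeping the criteria in one predicate makes it straightforward to apply the same filter to more than one connected view.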

Include a screenshot of your system named cover.png. Name your starting web page index.html. In addition, you will turn in a write-up of your project as a Word document or PDF. This is worth 35 pts and should contain the following:

  • Title and name.
  • A brief description of the implemented views and interactions in your system.
    • What does each view show? How do you interact with the system? As a novice user, how would I use your system? Also state your justification or reasoning for choosing each view.
  • Note any insights that may be gleaned from your system that may not be intuitive from the dataset alone, and/or any interesting findings that you made.
  • Student Evaluation
    • Have a student in the class use your application and give you feedback on it. Provide a short write-up of this experience. If there are bugs, limitations, issues, or future improvements that can be made to your system, especially if mentioned by your student evaluator, briefly discuss these and whether you feel they are warranted criticisms.
  • Extra credit justification, if you feel it is deserved.

In addition, the project milestone is worth 10 pts.

Milestone

Similar to Project #1, the milestone will gauge that you are progressing appropriately on the project. You will submit your current system files (html files, javascript, CSS, data files, etc.) along with a readme.txt noting the project's current status. For the milestone, you are expected to have picked out your dataset and implemented at least one D3 chart in your interface. Files will be submitted through SmartSite.

Suggested Datasets

See the list for Project #1. Remember you cannot repeat a dataset for multiple projects.


Final Project: Visualizing a Dataset of Your Choice (25%)

Duration: 21 days
Assigned: February 28
Proposal due: March 8 (electronic submission)
Proposal presentation: March 9 in class
Final project and write-up due: 11:55pm March 20 (electronic submission)

Description

In this assignment, you will apply the data visualization and interaction ideas discussed in lectures, and the skills learned in the previous projects, to a dataset of your choice. You will present your ideas to your peers in a clear and concise manner, and you will demo your project for the instructors. Note that due to the size of the class, you are required to partner with other students to do this project. Teams should have 3 to 4 members. Your final project will take the form of a fully fleshed-out system similar to Project #2, though since this is your second go-around in interactive system design and a team effort, we expect you collectively to do substantially better than what you did in your prior work. See below for a specific list of tasks your interface should accomplish.

Datasets

You are responsible for finding your own dataset this time. Pick something that you're curious about, and that you believe will produce interesting visualizations. The dataset should be rich, containing at least 4 related attributes. You are allowed to use a dataset from the Opportunity Project website, but may not select any of the prior suggested datasets.

Here are some ideas (although we encourage you to find your own!), along with possibly interesting attributes of each dataset:

  • Download your Facebook data (timeline posts, friends list, events, messages): link
  • Download a dump of Wikipedia (articles, images, links between articles, edit history, article discussion, user activity): link
  • Download one or more monthly dumps from Stack Exchange, a question-and-answer website (# of questions in each category over time, # of upvotes or downvotes per question or category, # of answers or comments per question or category, tag frequency, keyword frequency in questions and answers, user activity): link
  • Check out the various multivariate datasets available from UC Irvine's Machine Learning Repository. In particular, the cars dataset has been used in many high-dimensional visualizations, such as parallel coordinates.
  • Download a Twitter dataset. Here's a dump of tweets from 6/2009. It contains over 18 million tweets, about 990 MB compressed. The accompanying paper and the network of followers are available here.
  • A list of datasets mostly from visualization contests: link

Note: previous visualization work has been done on these datasets, and on similar types of data. You should strive to create a new, unique visualization, instead of merely reimplementing what others have already done. This is why it is important to describe your ideas well in the project proposal; we will be able to tell you if your ideas are too similar to previous work.

Proposal

The initial steps in this project are forming a team, selecting a dataset, and formulating a proposal. You will submit the proposal (Word document, ~2 pages, including images) describing the dataset you've chosen and how you plan to visualize it. What interesting questions can you ask about the dataset? Based on these, you should list three specific tasks you think your system can perform for your dataset. For example, you might want to compare individual data values with each other, or efficiently filter a dataset based on some user input. How will you use visualization to answer them? Specifically, describe how you will use shape, color, size, connections, position, movement, and other visual channels to render different aspects of the data.

Also consider how interaction will play into your visual design. What visualization parameters will be under user control? How can you let your users easily navigate and sift through this dataset? What interaction is important and why?

The proposal will prepare you for the final project, in which you will implement your ideas. That being said, it is expected that your design will change somewhat during the course of implementation.

You must state how tasks are divided among project members.

Proposal Presentation (in-class)

You will give a short presentation (HTML or Powerpoint, ~2 minutes) summarizing the dataset and detailing your visual design and expected interaction. Your presentation should show pictures (hand drawn, computer-generated, etc.) which clearly convey your visual idea. Further, you must discuss your intended interaction and how your complete package will answer "interesting" questions.

You will present your design to the class. Your overall goal is to convince Dr. Ma, Chris, and the rest of the class, that your visualization is good for the dataset and the work allocations are reasonable.

Evaluation Step

Similar to Project #2, your team will perform an evaluation of your developed system. This time, however, you will conduct a more formal, task-based evaluation in the form of a short user study. For your built system, take the three tasks from your proposal that you designed your system to accomplish. You will coordinate with another team and have its members (at least 3) separately try to perform these tasks using your system. Afterwards, each subject in your study (i.e., the other team's members) will fill out a NASA-TLX questionnaire for each task. Plot these results in your final write-up, and include any other qualitative commentary or feedback about your system from the study that you feel is pertinent.
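
To plot the NASA-TLX results, each completed form first has to be reduced to a score. The sketch below assumes the unweighted "Raw TLX" variant, which averages the six subscale ratings; the assignment does not say which variant to use, so treat this as one option. The ratings are assumed to be on a 0 to 100 scale, and the task names and numbers are made up.

```python
from statistics import mean

# The six NASA-TLX subscales.
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)

def raw_tlx(ratings):
    """Raw TLX score for one form: the unweighted mean of the six subscale ratings."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError("missing subscales: " + ", ".join(missing))
    return mean(ratings[name] for name in SUBSCALES)

def mean_tlx_per_task(responses):
    """Average the Raw TLX scores of all participants for each task."""
    return {task: mean(raw_tlx(form) for form in forms) for task, forms in responses.items()}

# Made-up responses: one task, two participants.
responses = {
    "task_1": [
        dict(zip(SUBSCALES, (50, 50, 50, 50, 50, 50))),
        dict(zip(SUBSCALES, (10, 20, 30, 40, 50, 60))),
    ],
}
scores = mean_tlx_per_task(responses)
```

Per-task means like these can be shown as a bar chart, though plotting the individual subscales per task is often more informative.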

Final Write-up

To describe and document the final design of your project, you will submit a final write-up (Word document, ~4-6 pages, including images). This write-up will be an extension of your initial project proposal; it can include the original proposal, to show what you initially set out to do, or it can further develop the proposal and add more detail. You will also describe your system evaluation here.

The final project write-up should include the following:

  • A description of the dataset and its relevance. That is, what you are trying to visualize, and why you want to visualize it.
  • How you initially proposed to visualize the data, and why you want to visualize it that way.
  • How the visualization system changed throughout the design / implementation process, such as things you tried that just didn't work.
  • Justifications and explanations for changes and your team's design decisions.
  • A description of the final visualization system, including implementation details, visual encodings, and interactions.
  • Your evaluation section.
  • How tasks were divided among team members.

Tasks and Grading Breakdown (100 pts)

You will implement the design that you proposed to me and presented to the class. Again, this system should enable us to answer questions that would require exhaustive effort if done with conventional spreadsheet browsing or database queries. This project has much room for creativity. Your objective is to impress me by making a highly interactive and visually appealing system that is useful and provides new insight into the data. The grading for this project is based on a combination of your proposal, your created interface, your final presentation (demo), and your final write-up.

  • 10 pts: Initial Proposal Write-up
  • 10 pts: In-class Proposal Presentation
  • 40 pts: Interactive System
    • Includes both overview and detail view(s).
    • Must use at least 3 visualization techniques.
    • System visualizations and UI flow are connected (no independent system components).
    • Should display good design choices for user interaction and visualization design based on what you've learned in class.
    • Must include at least three of the following features:
      • Temporal, spatial, or data value filtering.
      • Data selection, either single point clicking or lassoing.
      • Include a view that incorporates data clustering, dimensionality reduction, splatting, PCA, etc.
      • Animation
      • Include a visual-based querying mechanism
      • Pop-up widget / tooltip visualization
      • Leverage an online API for data retrieval
      • Interactive re-mapping of color, shape, and/or size for selected attributes
      • Be able to save the state of the visualization and a sequence of user interactions, and be able to go back in time to a saved state (i.e., system provenance)
      • Introduce a novel visualization technique.
  • 10 pts: Final System Demo
  • 30 pts: Final Report