
CKIDS DataFest Spring 2020 Project Descriptions

In Spring 2020, CKIDS will host several data science projects in collaboration with the GRIDS data science student association.  These projects were proposed by USC faculty and researchers through an open call for project proposals.  Below is a short overview of all twenty projects.

Selected DataFest Spring 2020 Projects

 

1. A Data Challenge for Parkinson’s Disease

This project will assemble a team to participate in the Biomarker and Endpoint Assessment to Track Parkinson’s Disease (BEAT-PD) DREAM Challenge by the Michael J. Fox Foundation and Sage Bionetworks.  The challenge is designed to benchmark new methods to predict Parkinson’s disease progression. Teams participating in the Challenge will have access to raw sensor data that can be used to predict individual medication state and symptom severity. Specifically, teams are asked to develop methods to predict on/off medication status, dyskinesia severity, and/or tremor severity.

POINTERS: https://www.michaeljfox.org/news/mjff-and-sage-bionetworks-launch-data-challenge.

SKILLS NEEDED: advanced data science skills.

WHAT STUDENTS WILL LEARN: to do data science with real-world medical data.

ADVISOR:

  • Neda Jahanshad, Keck School of Medicine

STUDENT PARTICIPANTS:

  • Nazgol Tavabi, Ph.D. student in Computer Science, Viterbi School of Engineering
  • Che-Pai Kung, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Abhivineet Veeraghanta, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Likitha Lakshminarayanan, M.Sc. student in Electrical Engineering, Viterbi School of Engineering

2. Using Biomedical Researcher Judgments to Predict Clinical Trial Outcomes

Human patients should only be assigned to experimental medical treatments when investigators are truly uncertain about the novel treatment’s clinical utility. As such, the outcomes of clinical trials are difficult to predict by design. The goal of this project is to work toward building a predictive model of clinical trials. The first step is to categorize treatments based on their history and diseases based on their treatability, using FDA records among other data sources. In collaboration with the Biomedical Ethics Unit at McGill University, we have collected many probability predictions about scientific and operational outcomes of newly registered clinical trials. Once pre-processing is complete, we will begin building a model to predict the judgments of medical experts based on several trial and researcher characteristics. This model can be used to assess whether medical researchers are biased in their judgments about their own trials. Finally, we aim to assemble these components into a model that predicts the outcomes of clinical trials by accounting for the history of the treatment, the treatability of the disease, and the judgments of medical researchers, adjusted for any revealed biases.

POINTERS: The data is not publicly available. Adding a programmer may require IRB approval prior to working with the expert forecast data.
Some additional information can be found at: http://www.translationalethics.com/projects/forecast-study/.  Other relevant sources include: https://clinicaltrials.gov/, https://www.fda.gov/.

SKILLS NEEDED: Semi-automated data collection, Classification, Predictive modeling

WHAT STUDENTS WILL LEARN: Students will learn how to organize and classify data from external sources and prepare them as input into a predictive model.

ADVISOR:

  • Daniel Benjamin, Viterbi School of Engineering

3. Towards Automated Understanding of Scientific Software

Data science projects require knowledge of software that changes rapidly. As a result, scientists spend hours reading long documentation and manuals instead of advancing their scientific fields. In this project, we aim to automatically extract relevant aspects of scientific software (e.g., what it does, how to install it, how to run it, or how to cite it) from documentation and code using machine learning techniques. The students will build on an existing baseline of classifiers and try to improve the existing results.
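As a rough illustration of the kind of baseline such classifiers provide, the sketch below trains a TF-IDF plus logistic regression pipeline to route documentation excerpts into metadata categories. The category names and example sentences here are illustrative stand-ins, not the project's actual corpus or label set (see the SM2KG repository for those).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy training excerpts; the real corpus lives in the SM2KG repository.
docs = [
    "This tool analyzes genomic sequences and produces alignments.",
    "Run pip install mytool to set up the package.",
    "Use mytool --input data.csv to start an analysis.",
    "If you use this software, please cite our 2019 paper.",
]
labels = ["description", "installation", "invocation", "citation"]

# Word and bigram TF-IDF features feeding a linear classifier.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(docs, labels)

print(clf.predict(["Install the package with pip install mytool"]))
```

With a real corpus, the same pipeline extends naturally to held-out evaluation (e.g., cross-validation) to compare against the existing baseline without overfitting.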

POINTERS: Github repository with existing corpus and classifiers: https://github.com/KnowledgeCaptureAndDiscovery/SM2KG
See https://github.com/KnowledgeCaptureAndDiscovery/SM2KG/tree/master/documentation for documentation and paper.

SKILLS NEEDED: Python, machine learning/sklearn, knowledge graphs (optional)

WHAT STUDENTS WILL LEARN: Build and extend a corpus to solve a data science problem (in this case, classification of software metadata); think about different ways to organize the data to solve the problem; train classifiers to produce alternative results; build a testbed for comparing results fairly and without overfitting; generate explanations for the best results.  The students will also learn to post-process their results in order to incorporate them into an application developed by others.

ADVISOR:

  • Daniel Garijo, Viterbi School of Engineering

STUDENT PARTICIPANTS:

  • Haripriya Dharmala, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Jiaying Wang, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Vedant Diwanji, M.Sc. student in Computer Science, Viterbi School of Engineering

4. A Knowledge Graph for Cybersecurity Experiments

The DETER cybersecurity testbed has been running experiments for several years, collecting information about intrusions, vulnerabilities, and mitigation strategies.  This project will capture cybersecurity experiments as a knowledge graph that can be browsed, queried, and mined to find patterns and create models of cyberattacks.

POINTERS: https://deter-project.org.

SKILLS NEEDED: Knowledge graphs, python.

WHAT STUDENTS WILL LEARN: Representation of scientific experiments, knowledge graph creation and publication.

ADVISORS:

  • Daniel Garijo, Viterbi School of Engineering
  • Jelena Mirkovic, Viterbi School of Engineering

STUDENTS:

  • Alex Zihuan Ran, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Hardik Mahipal Surana, M.Sc. student in Computer Science, Viterbi School of Engineering

5. Connections within Contemporary Feminism Movements

This project will look at event data collected from several recent feminist social movements to understand their connections to each other. Specifically, it will explore individuals and organizations that played an instrumental role in movement mobilization, relationship brokerage between movements, and building and sustaining activist communities.  Previous research suggests that these distinctive movements are often not isolated incidents, but are mobilized by a core group of “leaders” or by similar ideas and frames. The goal is to understand how seemingly disconnected movements relate to one another, helping to reveal the lasting impact of mediated movements.
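One standard way to surface brokerage roles of the kind described above is betweenness centrality on an interaction network. The sketch below uses networkx on a tiny hypothetical two-movement network; all node names and edges are made up for illustration.

```python
import networkx as nx

# Hypothetical interaction network spanning two movements;
# nodes are accounts, edges are retweets/mentions.
G = nx.Graph()
movement_a = ["a1", "a2", "a3"]
movement_b = ["b1", "b2", "b3"]
G.add_edges_from([(u, v) for u in movement_a for v in movement_a if u < v])
G.add_edges_from([(u, v) for u in movement_b for v in movement_b if u < v])
# "broker" is the only account tying the two movements together.
G.add_edges_from([("broker", "a1"), ("broker", "b1")])

# Betweenness centrality scores how often a node sits on shortest
# paths between other nodes -- high scores flag brokerage positions.
bc = nx.betweenness_centrality(G)
top = max(bc, key=bc.get)
print(top)
```

On real movement data the same measure, computed per time window, can show whether the same accounts broker across successive movements.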

SKILLS NEEDED: Python, text mining, network analysis.

WHAT STUDENTS WILL LEARN: Communication data analysis.

ADVISOR:

  • Aimei Yang, Annenberg School for Communication

STUDENTS:

  • Keerti Bhogaraju, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Ian Myoungsu Choi, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Negar Mokhberian, Ph.D. student in Computer Science, Viterbi School of Engineering
  • Nazanin Alipourfard, Ph.D. student in Computer Science, Viterbi School of Engineering

6. Tracking health and nutrition signals from social media data

This project will explore the ability to track real-life health and nutrition signals from social media data, focusing on data from Instagram and Foursquare. We will investigate the quality of Instagram posts as a source of data for measurements of dietary patterns and nutrition quality, focusing on spatial, textual, and (*new in this semester*) image content of posts linked to food outlets in Los Angeles, as well as nutritional content analysis of menus available online. Multiple aims will be investigated in this project, including: scraping data from social media; NLP of tags, comments, and menu data; image analysis; predictive models and social network analysis; and more. Also new in this semester: “ground truth” data on dietary patterns of LA residents will be available, enabling validation of dietary measures and predictive models built from Instagram posts.

The project will build on the DataFest 2019 project, and will expand the scope to access up-to-date data from Instagram, in particular: data with images, the underlying social connections / social network, and, of course, more timely data (which requires data scraping).

SKILLS NEEDED: Programming in Python or R; machine learning; statistical analysis. Optional: social network analysis, image analysis, NLP, sentiment analysis.

WHAT STUDENTS WILL LEARN: Tracking real-life health signals from social media data and evaluating their quality and representativeness from a health perspective; spatial statistical analysis using big data combined from various sources (social media data, official public health statistics); building predictive models for public health; possible experience participating in writing conference abstracts and journal papers.

ADVISORS:

  • Andrés Abeliuk, Viterbi School of Engineering
  • Abigail Horn, Keck School of Medicine
  • Kayla de la Haye, Keck School of Medicine
  • Yelena Mejova, ISI Foundation in Turin, Italy

STUDENTS:

  • Abhilash Karpurapu, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Erica Xia, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Iris Liu, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Spoorti Nidagundi, M.Sc. student in Computer Science, Viterbi School of Engineering

7. Predicting Effective Tax Rate of Publicly-Traded Firms

The purpose of this project is to analyze business firms’ text disclosures to determine whether those disclosures are related to firms’ tax rates. In so doing, we first capture information about the text and then relate that information to quantitative information using statistical modeling. So far, we have generated and used bags of words to capture information that we expect will provide insight into the tax rates that firms incur. Our knowledge acquisition approach, to gather those bags of words, was to interview an expert. We then counted the number of occurrences of those words in our text, and used statistical models to relate those occurrence counts to different measures of tax rates. We find that those bags of words are statistically significantly related to the measures of tax rates that firms pay. In addition, we find that “tax-specific bags of words” work “better” than “generic accounting bags of words.”

Current research: We would like to expand our approach beyond bags of words to include phrases (groups of words) and account for word order. We would also like to move toward an approach that is less heuristic.
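A minimal sketch of the counting step described above, assuming a small hypothetical tax-specific word list (the project's actual expert-derived bags of words are not reproduced here):

```python
import re
from collections import Counter

# Hypothetical "tax-specific" bag of words; the real lists came from
# expert interviews and are not reproduced here.
tax_bag = {"deferred", "carryforward", "valuation", "allowance", "haven"}

def bag_count(text, bag):
    """Count occurrences of bag words in one disclosure's text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in bag)
    return sum(counts.values()), counts

disclosure = ("The company recorded a valuation allowance against its "
              "deferred tax assets, including loss carryforward positions.")
total, counts = bag_count(disclosure, tax_bag)
print(total, dict(counts))
```

Per-firm-year totals like this become the independent variables that the statistical models relate to measures of tax rates.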

POINTERS: Our data comes from financial reports filed with the SEC (Securities and Exchange Commission) – it is text data. Our data source is accounting text disclosures, which give us 81,600 firm-years with 16,681 unique firms. In addition to the text, we gather quantitative data on tax rates and other control variables for our statistical models.

SKILLS NEEDED: Text analysis and statistical analysis

WHAT STUDENTS WILL LEARN: Text and statistical analysis of business disclosures.

ADVISOR:

  • Daniel E. O’Leary, Leventhal School of Accounting and Marshall School of Business

STUDENTS:

  • Jae Young Kim, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Yuqin Jiang, M.Sc. in Machine Learning and Data Science, Viterbi School of Engineering
  • Kanlin Cheng, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Saravanan Manoharan, M.Sc. student in Computer Science, Viterbi School of Engineering

8. Text Analysis, Social Networks and Crowdsourcing

The purpose of this project is to analyze a crowdsourcing setting for both sentiment and other categories of meaning in the text, and for the roles and impact of a network of contributors on votes and potentially on content.

POINTERS: Our data comes from a large crowdsourcing setting.

SKILLS NEEDED: Text analysis and Social Network analysis.

WHAT STUDENTS WILL LEARN: Crowdsourcing, text analysis and social network analysis, and their interface.

ADVISORS:

  • Daniel E. O’Leary, Leventhal School of Accounting and Marshall School of Business
  • Keith Burghardt, Viterbi School of Engineering

STUDENTS:

  • Dan Peng, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Gitanjali Kanakaraj, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Hanieh Arabzadehghahyazi, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Naiya Shah, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Nana Andriana, M.Sc. student in Computer Science, Viterbi School of Engineering

9. Modelling Spatiotemporal Relationships between Wastewater Injection and Induced Seismicity

Induced seismicity refers to earthquakes caused by human activity, such as disposing of wastewater by injecting it into the subsurface. This project will use spatiotemporal statistics to model space-time relationships between injected wastewater and induced earthquakes. The model will incorporate space-time data on seismic activity and associated human systems to create forecasts of induced earthquakes.
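As a toy illustration of one space-time linkage rule, the sketch below flags earthquakes that occur within an assumed distance and time window of an active injection well. Coordinates, times, and thresholds are all made up; real catalogs would come from the sources below.

```python
import numpy as np

# Toy coordinates (x, y in km) and times (days): each well row is
# (x, y, injection start day); each quake row is (x, y, event day).
wells = np.array([[0.0, 0.0, 10.0], [50.0, 50.0, 5.0]])
quakes = np.array([[1.0, 2.0, 30.0], [48.0, 52.0, 8.0], [100.0, 100.0, 1.0]])

def linked(quake, wells, max_km=15.0, max_lag_days=60.0):
    """Does a quake fall within the space-time window of any active well?"""
    dist = np.hypot(wells[:, 0] - quake[0], wells[:, 1] - quake[1])
    lag = quake[2] - wells[:, 2]          # quake must follow injection start
    return bool(np.any((dist <= max_km) & (lag >= 0) & (lag <= max_lag_days)))

flags = [linked(q, wells) for q in quakes]
print(flags)
```

A statistical model would replace the hard thresholds with fitted space-time kernels, but the windowed association above is a common starting point for labeling candidate induced events.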

POINTERS: https://www.usgs.gov/natural-hazards/earthquake-hazards/induced-earthquakes?qt-science_support_page_related_con=4#qt-science_support_page_related_con, http://www.ou.edu/ogs/research/earthquakes/catalogs, http://www.occeweb.com/og/ogdatafiles2.htm

SKILLS NEEDED: R and Python programming, basic statistics, data analysis. Optional: GIS

WHAT STUDENTS WILL LEARN: Spatiotemporal analysis of multi-dimensional data, niche space-time visualization methods, applying an end-to-end machine-learning workflow to a multidisciplinary problem.

ADVISOR:

  • Orhun Aydin, Dornsife College of Letters, Arts and Sciences

10. Data scraping for salary benchmarking

This project will develop a data scraper to collect salary records from a website that provides compensation data for faculty at public universities. When provided a list of faculty names and institutional affiliations, this program will search for the associated records, extract the relevant results, and copy the data into a spreadsheet. The purpose of this project is to explore the feasibility of automating an otherwise time-consuming data collection task required for benchmarking of faculty salaries in relation to peer institutions. This will ultimately facilitate a number of important tasks, including analysis of potential salary disparities within certain disciplines and faculty tracks.
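A minimal sketch of the extract-and-copy step, parsing a hypothetical results table into CSV rows. The live site may render results dynamically via JavaScript, in which case a headless browser or the site's own data endpoint would be needed instead of static HTML parsing; the table layout and field names below are assumptions.

```python
import csv
import io
from bs4 import BeautifulSoup

# Hypothetical results page standing in for a scraped search result.
html = """
<table id="results">
  <tr><th>Name</th><th>Location</th><th>Gross Pay</th></tr>
  <tr><td>DOE, JANE</td><td>Los Angeles</td><td>185,000</td></tr>
  <tr><td>SMITH, JOHN</td><td>Berkeley</td><td>142,500</td></tr>
</table>
"""

rows = []
table = BeautifulSoup(html, "html.parser").find("table", id="results")
for tr in table.find_all("tr")[1:]:  # skip the header row
    name, location, pay = (td.get_text(strip=True) for td in tr.find_all("td"))
    rows.append({"name": name, "location": location, "gross_pay": pay})

# Copy the extracted records into a CSV "spreadsheet".
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "location", "gross_pay"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Given a list of faculty names, the full scraper would issue one search per name and append the matched records to the same CSV.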

POINTERS: Data to be obtained from https://ucannualwage.ucop.edu/wage/.  A report from 2018-19 USC RTPC Faculty Affairs Committee  describes a similar data collection (conducted manually) and illustrates the sorts of analyses that this data could facilitate: https://academicsenate.usc.edu/files/2019/06/RTPCFAC-2018-19-white-paper-on-salary-benchmarking.pdf.

SKILLS NEEDED: Web extraction and data integration

WHAT STUDENTS WILL LEARN: Create a data scraper to extract selected records from an online database and copy them to an offline spreadsheet.

ADVISORS:

  • T.J. McCarthy, Price School of Public Policy
  • Ginger Clark, USC
  • Fred Morstatter, Viterbi School of Engineering

STUDENTS:

  • Matthew Lim, Undergraduate (PDP) student in Computer Science, Viterbi School of Engineering
  • Minh Nguyen, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Kevin Tsang, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Sindhu Ravi, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Vihang Mangalvedhekar, M.Sc. student in Computer Science, Viterbi School of Engineering

11. Microtelcos and the Digital Divide in CA

The broadband access market is increasingly dominated by a few large ISPs. However, small and medium-size operators (“microtelcos”) are critical to connectivity in low-income and rural communities in CA, serving markets of little interest to large operators. The primary goal of this project is to combine broadband infrastructure deployment data from the CPUC (California Public Utilities Commission) and socioeconomic data from the Census Bureau to understand the characteristics of the communities served by microtelcos, and to analyze whether the presence of a microtelco operator contributes to higher levels of connectivity in the community. The main technical challenge is to combine spatial data found in CPUC files with census block level data provided by the Census Bureau. This is part of an ongoing research program called Connected Communities and Inclusive Growth (CCIG).

POINTERS: The CPUC data is at: https://www.cpuc.ca.gov/Broadband_Availability/
CCIG website: https://arnicusc.org/research/connected-cities/

SKILLS NEEDED: R or Stata, some knowledge of ArcGIS desirable

WHAT STUDENTS WILL LEARN: Spatial econometrics, policy analysis of broadband infrastructure, data visualization.

ADVISOR:

  • Hernan Galperin, Annenberg School for Communication

12. Disparities in educational achievement

The project will combine socio-economic data from US Census with college and K-12 performance data to identify correlates of positive educational outcomes. Of specific interest will be assessing how economic inequalities and racial disparities affect educational achievement in different regions of US.

POINTERS: US college data: at https://nces.ed.gov/ipeds/use-the-data and https://collegescorecard.ed.gov/data/; how the data was used https://cew.georgetown.edu/cew-reports/CollegeROI/; international K12 data https://www.oecd.org/pisa/data/.

SKILLS NEEDED: Python required, familiarity with statistics

WHAT STUDENTS WILL LEARN: Students will learn methods for heterogeneous data analysis, including mixed effects and fair models

ADVISOR:

  • Kristina Lerman, Viterbi School of Engineering

STUDENTS:

  • Yuzi He, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Ziping Hu, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Zicai Wang, M.Sc. student in Applied Data Science, Viterbi School of Engineering

13. Gender inclusion in science

This project will measure representation of women in various scientific disciplines across time (and different countries) and identify institutions who have succeeded in creating a more welcoming environment for women. While there are already studies that use bibliographic data to map career trajectories of women, they do not focus on the role that institutions (and countries) – and their policies – play in retaining female researchers.

The data comes from the Microsoft Academic Graph, containing millions of papers from institutions around the world across many decades. We will use the Ethnea API to extract the gender (and ethnicity) of authors.

POINTERS: Microsoft Academic Graph data will be provided.

SKILLS NEEDED: Python required, familiarity with statistics

WHAT STUDENTS WILL LEARN: Students will learn methods for heterogeneous data analysis, including mixed effects and fair models

ADVISORS:

  • Kristina Lerman, Viterbi School of Engineering
  • Goran Muric, Viterbi School of Engineering

STUDENTS:

  • Ninareh Mehrabi, PhD student in Computer Science, Viterbi School of Engineering
  • Aditya Gupta, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Vineetha Nadimpalli, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Ayushi Jha, M.Sc. student in Computer Science, Viterbi School of Engineering

14. Annotating Paleoclimate Data

Paleoclimate data is highly diverse, requiring different sets of metadata to describe the various datasets. In this project, you will help build an interface to assist researchers in annotating their paleoclimate datasets according to an evolving reporting standard (PaCTS) and downloading them in the Linked Paleo Data (LiPD) format. The interface should be highly interactive (a wizard) to accommodate the diversity of the data. It should also offer editing capabilities for existing datasets (uploading LiPD files), check their compliance with PaCTS, plot location information and the time series, download into the LiPD format, and upload to a semantic wiki and/or an SQL database. In addition, the interface should support the use of a recommender system (to be built) to help researchers annotate their datasets.

POINTERS: To build the recommender system, 700 expertly annotated datasets are available here: http://wiki.linked.earth/Main_Page. Other datasets will be made available through Dropbox/Figshare.  A preliminary interface is available here: http://lipd.net/playground. Code will be made available to students.  The standard is available here: http://wiki.linked.earth/PaCTS_v1.0

SKILLS NEEDED: Javascript, Python, Web technologies. No prior knowledge about paleoclimate data is needed.

WHAT STUDENTS WILL LEARN: UI/UX design, recommender systems, web technologies

ADVISORS:

  • Deborah Khider, Viterbi School of Engineering
  • Julien Emile-Geay, Dornsife College of Letters, Arts, and Sciences

STUDENTS:

  • Yincheng Lin, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Shravya Manety, M.Sc. student in Computer Science, Viterbi School of Engineering

15. Modeling Uncertainty in Drought Products

Droughts can have a substantial impact on agricultural systems and human livelihoods. A Python package to calculate various drought indices is being developed. In this project, you will expand on this package and develop methods to test the sensitivity of the models to various input datasets and parameters. In addition, you will develop post-processing code to determine the return period of a drought (is it a 1-in-20-year event or a 1-in-5-year event?).

The project will build on the DataFest Fall 2019 project to create preliminary workflows/visualization tools for sensitivity analysis in drought products.

POINTERS: Drought products will be generated from national weather products available here: https://data.mint.isi.edu/files/

SKILLS NEEDED: Python, Probability theory, statistics

WHAT STUDENTS WILL LEARN: Uncertainty modeling, statistical modeling

ADVISOR:

  • Deborah Khider, Viterbi School of Engineering

16. Characterizing the counter-narratives of climate change

Top climate scientists post their findings and views regularly on social media. These very scientists are met with tweets from those with opposing views, often containing vitriolic and false information. It is important that we can identify and characterize these tweets to understand the counter-narratives of climate change. We will address topics including false information, bot campaigns, and harassment.

POINTERS: Students will collect tweets from Twitter.

SKILLS NEEDED: Data collection, basic classification.

WHAT STUDENTS WILL LEARN: Data scraping, Machine learning, Text classification, Computational social science

ADVISORS:

  • Fred Morstatter, Viterbi School of Engineering
  • Deborah Khider, Viterbi School of Engineering

STUDENTS:

  • Abhilash Pandurangan, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Aditya Jajodia, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Sushmitha Ravikumar, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Vanshika Sridharan, M.Sc. student in Computer Science, Viterbi School of Engineering

17. Digital Democracy: Using Social Media to Improve Political Discourse

Politicians in modern democracies across the world have eagerly adopted social media for engaging their constituents, entering into direct dialogue with citizens. From the perspective of political actors, there is a need to continuously gather, monitor, analyze, and visualize politically relevant information from online social media, with the goal of improving communication with citizens and voters. The goal of this project is to create a tool that enhances interaction and dialogue between political actors and their followers. This will be achieved by creating compact and comprehensive summaries that aggregate and visualize common narratives, thus reducing the cognitive load required to read all the messages and streamlining the dialogue experience.

POINTERS: We will be using Twitter data from the current US presidential campaign.

SKILLS NEEDED: Programming, NLP experience.

WHAT STUDENTS WILL LEARN: Implementing state-of-the-art NLP techniques, research skills, and teamwork skills.

ADVISOR:

  • Andrés Abeliuk, Viterbi School of Engineering

STUDENTS:

  • Alex Spangher, Ph.D. student in Computer Science, Viterbi School of Engineering
  • Yash Shah, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Swetha Thomas, M.Sc. student in Electrical Engineering, Viterbi School of Engineering
  • Hongyu Li, M.Sc. student in Analytics, Viterbi School of Engineering
  • Raveena Kshatriya, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Abhi Thadeshwar, M.Sc. student in Computer Science, Viterbi School of Engineering

18. Turning Library Collections into Data Science Challenges and Resources

Libraries, museums, and archives hold unique collections that may be very useful for data science. These collections include photographs, videos, letters, and other artifacts that could give unique insights when analyzed. In this project, students will work with the USC Libraries to identify existing collections that would be potentially interesting as targets for data science, describe those collections in collaboration with the USC Libraries so they can be promoted as data science resources, and create APIs and other access mechanisms for data science researchers on campus and beyond.

POINTERS: An example of a unique collection at USC is the Dance Heritage Video Archive: http://digitallibrary.usc.edu/cdm/landingpage/collection/p15799coll105.

SKILLS NEEDED: Interest and basic knowledge in data science.

WHAT STUDENTS WILL LEARN: Identifying data science opportunities and projects, developing datasets for data science challenges, learning to work with humanities researchers.

ADVISORS:

  • Yolanda Gil, Viterbi School of Engineering
  • Deborah Holmes-Wong, USC Libraries

STUDENTS:

  • Chaitra Mudradi, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Feilong Wu, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Hsing-Hsien Wang, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Maria MacHarrie, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Shubhankar Singh, M.Sc. student in Computer Science, Viterbi School of Engineering
  • Yangtao Hu, M.Sc. student in Computer Science, Viterbi School of Engineering

19. Capturing the Provenance of Data Analysis Using the PROV Standard

Documenting how a result was obtained from data analysis involves documenting the software, software settings, and datasets used to obtain that result so it can be explained properly. The current ASSET interface enables users to document the provenance of data analysis no matter what infrastructure they used (R scripts, scikit-learn, etc.). This project will focus on capturing provenance records for data science projects and using the W3C PROV standard to export those records. It will also develop tools to mine provenance data to find common patterns of use.

POINTERS: This project will extend the ASSET workflow sketching interface (http://asset-project.info/sketching.html) to capture provenance sketches.

SKILLS NEEDED: Knowledge graphs, or user interface development in Firebase and JavaScript.

WHAT STUDENTS WILL LEARN: Interfaces for capturing data analysis steps, provenance standards, data analysis workflow representations.

ADVISOR:

  • Yolanda Gil, Viterbi School of Engineering

20. Team Dynamics in Online Multiplayer Games

Competitive online multiplayer team games such as Counter-Strike, PUBG, or League of Legends are extremely popular. Multiple teams of professional players compete in hundreds of tournaments yearly, and player transfers between teams are common. The goal of the project is to measure the effects of player transfers and to answer questions such as: How does a new player affect a team’s performance? How does changing teams affect a player’s performance? The world of online games can serve as a fruitful area for tackling more fundamental questions about human society and collaboration dynamics in different settings.
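As a toy illustration of a before/after transfer comparison, the sketch below computes the change in one player's mean match rating across a hypothetical transfer. The team names, ratings, and the simple mean-difference measure are all illustrative; real data would come from scraped tournament results.

```python
import pandas as pd

# Toy match-level ratings for one player; the transfer from team
# "Alpha" to team "Bravo" occurs after match 4.
df = pd.DataFrame({
    "match": range(1, 9),
    "team":  ["Alpha"] * 4 + ["Bravo"] * 4,
    "rating": [1.10, 1.05, 1.12, 1.08, 0.95, 1.20, 1.15, 1.18],
})

# Mean rating on each team; the difference is a naive transfer effect.
by_team = df.groupby("team")["rating"].mean()
effect = by_team["Bravo"] - by_team["Alpha"]
print(by_team.to_dict(), round(effect, 4))
```

A mean difference like this ignores opponent strength and form trends; across many transfers, a regression with player and team fixed effects would give a less confounded estimate.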

POINTERS: Datasets are at https://liquipedia.net/

SKILLS NEEDED: Python

WHAT STUDENTS WILL LEARN: Statistical analysis, building a research pipeline, REST APIs, data visualization and representation

ADVISOR:

  • Goran Muric, Viterbi School of Engineering

STUDENTS:

  • Kevin Tsang, M.Sc. student in Applied Data Science, Viterbi School of Engineering
  • Jiaqi Liu, M.Sc. student in Electrical Engineering, Viterbi School of Engineering