# Course project

Proposal deadline: March 6, 2018 — one or two paragraphs describing a paper you are interested in working on or a dataset you are planning on exploring.

Project presentations: April 17, 2018

Deadline for paper submissions: May 4, 2018 at noon.

This project is meant to demonstrate some of the things you have learned in the class, applied to real-world data or theory problems. The project is rather open-ended but must come from one of the following themes:

- Replication of an existing network paper: This type of project requires replicating someone else’s work (that is, writing your own code!), testing how well it works, and possibly extending it a little.
- Data analysis applying tools from class: This type of project requires a desire to do some novel data analysis. You should find your own network dataset and explore it. The goal is not just to count edges and triangles or to fit a stochastic blockmodel, but to find interesting insights into some real-world phenomenon.
- Original research with a network component: If your research already involves networks, be it theoretical, methodological, or data-analytic, present it in a compelling fashion.

Potential papers to reproduce:

- Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9(Sep), 1981-2014.
- ~~Latouche, P., Birmelé, E., & Ambroise, C. (2011). Overlapping stochastic block models with application to the French political blogosphere. The Annals of Applied Statistics, 309-336.~~ (taken)
- Airoldi, E. M., Costa, T. B., & Chan, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems (pp. 692-700).
- Durante, D., & Dunson, D. B. (2014). Nonparametric Bayes dynamic modelling of relational data. Biometrika, 101(4), 883-898.
- Fosdick, B. K., & Hoff, P. D. (2015). Testing and modeling dependencies between a network and nodal attributes. Journal of the American Statistical Association, 110(511), 1047-1056.

Possible data analysis projects to take on:

- Create/download many network datasets and evaluate them for properties such as the small-world property, heavy-tailed degree distributions, etc.
- Find an interesting data set that has been analyzed using network model A and analyze it with network model B — compare and contrast the results.
- Updated version of Adamic, L. A., & Glance, N. (2005, August). The political blogosphere and the 2004 US election: divided they blog. In Proceedings of the 3rd international workshop on Link discovery (pp. 36-43). ACM. (Maybe using Twitter or Reddit data?)
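
As a possible starting point for the first idea in the list above, here is a minimal sketch in plain Python of three summary statistics that are commonly used when checking for small-world structure and heavy-tailed degrees: average clustering, average shortest-path length, and the degree histogram. The function names and the toy ring-lattice example are illustrative assumptions, not part of any course material, and a real project would compare these values against a suitable random-graph baseline.

```python
from collections import Counter, deque


def build_graph(edges):
    """Build an undirected simple graph as a dict mapping node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Each edge between two neighbors is counted twice in this sum.
        twice_links = sum(len(adj[a] & nbrs) for a in nbrs)
        total += twice_links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean BFS distance over all ordered pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


def degree_histogram(adj):
    """Map each degree to the number of nodes with that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Toy example (hypothetical data): a ring of 10 nodes, each linked to its
# two nearest neighbors on either side.
n = 10
ring_edges = [(i, (i + d) % n) for i in range(n) for d in (1, 2)]
adj = build_graph(ring_edges)

print("average clustering:", average_clustering(adj))
print("average path length:", average_path_length(adj))
print("degree histogram:", dict(degree_histogram(adj)))
```

On this toy lattice every node has degree 4 and the average clustering is 0.5. For real datasets, a heavy-tailed degree distribution would show up as a histogram with a few very high-degree nodes, which is usually easier to judge on a log-log plot.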