Area 5: New Challenges for New Data
Data sources are increasingly diverse, and each generates vast amounts of information: the internet produces data on social networks; firms have access to data with more variables than observations; and administrative data (on employment, health…) may cover the entire population. The quantities produced are so large that classical statistical methods are often ill-suited to them, either because of the number of variables or because of the number of observations.
This area therefore intends to couple the Center for Secured Access to Data (CASD, an EQUIPEX) with a Center for the Study of Big Data, where econometricians, statisticians, and micro-economists will identify new forms of data and then design the new methods needed to exploit them.
Because information is often stored at the level of the individual or the firm, privacy concerns become central. The Center tackles these issues by fostering cooperation between statisticians and lawyers. The use of the CASD is expected to overcome the reluctance of firms and administrations to let researchers access these data. This in turn should open up a promising, but so far unexplored, field of multi-disciplinary research.
Here is the Area's presentation to the last Board of Trustees (September 2019).
Here is an updated list of the Area's publications.