Data Science & AI Development

Working Paper
Hannah Mayer, Jin Paik, Jenny Hoffman, and Steven Randazzo. Working Paper. “AI in Enterprise: AI Product Management”. AI in Enterprise - AI Product Management (P Skomoroch).pdf
Jin Paik, Steven Randazzo, and Jenny Hoffman. Working Paper. “AI in the Enterprise: How Do I Get Started?”

While there are dispersed resources for learning about artificial intelligence, there remains a need to cultivate a community of practitioners for recurring exposure to, and sharing of, best practices in the enterprise. That is why the Laboratory for Innovation Science at Harvard launched the AI in the Enterprise series, which exposes managers and executives to interesting applications of AI and the decisions behind developing such tools.

Moderated by Karim R. Lakhani, HBS Professor and co-author of Competing in the Age of AI, the most recent virtual session, with over 240 attendees, featured Rob May, General Partner at PJC, an early-stage venture capital firm, and founder of Inside AI, a premier source for information on AI, robotics, and neurotechnology. Together, they discussed why we have seen a rise in interest in AI, what managers should consider when wading into the AI waters, and what steps they can take when it is time to do so.

AI in Enterprise - How Do I Get Started (R May).pdf
Roberto Verganti, Luca Vendraminelli, and Marco Iansiti. 3/19/2020. “Innovation and Design in the Age of Artificial Intelligence”. Publisher's Version

At the heart of any innovation process lies a fundamental practice: the way people create ideas and solve problems. This “decision making” side of innovation is what scholars and practitioners refer to as “design”. Decisions in innovation processes have so far been taken by humans. What happens when they can be substituted by machines? Artificial Intelligence (AI) brings data and algorithms to the core of innovation processes. What are the implications of this diffusion of AI for our understanding of design and innovation? Is AI just another digital technology that, akin to many others, will not significantly question what we know about design? Or will it create transformations in design that current theoretical frameworks cannot capture?

This article proposes a framework for understanding design and innovation in the age of AI. We discuss the implications for design and innovation theory. Specifically, we observe that, as creative problem solving is significantly conducted by algorithms, human design increasingly becomes an activity of sense making, i.e. understanding which problems should or could be addressed. This shift in focus calls for new theories and brings design closer to leadership, which is, inherently, an activity of sense making.

Our insights are derived from and illustrated with two cases at the frontier of AI, Netflix and AirBnB (complemented with analyses in Microsoft and Tesla), which point to two directions for the evolution of design and innovation in firms. First, AI enables an organization to overcome many past limitations of human-intensive design processes, by improving the scalability of the process, broadening its scope across traditional boundaries, and enhancing its ability to learn and adapt on the fly. Second, and maybe more surprising, while removing these limitations, AI also appears to deeply enact several popular design principles. AI thus reinforces the principles of Design Thinking, namely: being people-centered, abductive, and iterative. In fact, AI enables the creation of solutions that are more highly user-centered than human-based approaches (i.e., to an extreme level of granularity, designed for every single person); that are potentially more creative; and that are continuously updated through learning iterations across the entire product life cycle.

In sum, while AI does not undermine the basic principles of design, it profoundly changes the practice of design. Problem solving tasks, traditionally carried out by designers, are now automated into learning loops that operate without limitations of volume and speed. The algorithms embedded in these loops think in a radically different way than a designer who handles complex problems holistically with a systemic perspective. Algorithms instead handle complexity through very simple tasks, which are iterated continuously. This article discusses the implications of these insights for design and innovation management scholars and practitioners.

Marco Iansiti and Karim R. Lakhani. 3/3/2020. “From Disruption to Collision: The New Competitive Dynamics.” MIT Sloan Management Review.
In the age of AI, traditional businesses across the economy are being attacked by highly scalable data-driven companies whose operating models leverage network effects to deliver value.
Karim R. Lakhani, Andrew Hill, Po-Ru Loh, Ragu B. Bharadwaj, Pascal Pons, Jingbo Shang, Eva C. Guinan, Iain Kilty, and Scott Jelinsky. 2017. “Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis.” GigaScience, 6, 5, Pp. 1-10. Publisher's Version

BACKGROUND: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets.

RESULTS: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest, a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645,863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project.

CONCLUSIONS: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics.
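
As a quick check on the run-time figures quoted in the abstract above, the short sketch below converts the rounded values (4.8 hours and 29 seconds) into a speedup ratio. The function name `speedup` is hypothetical and used only for illustration. Because the published run times are rounded, the ratio computed here is only approximately the reported 591-fold.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_seconds / after_seconds


# Rounded run times as quoted in the abstract (not the exact measurements).
before = 4.8 * 3600  # 4.8 hours expressed in seconds
after = 29           # seconds

ratio = speedup(before, after)
print(f"approximate speedup from rounded run times: {ratio:.0f}x")
```

The small gap between this value and the reported 591-fold is consistent with the 4.8-hour figure being rounded to one decimal place.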

Christoph Riedl, Richard Zanibbi, Marti A. Hearst, Siyu Zhu, Michael Menietti, Jason Crusan, Ivan Metelsky, and Karim R. Lakhani. 2016. “Detecting Figures and Part Labels in Patents: Competition-Based Development of Image Processing Algorithms.” International Journal on Document Analysis and Recognition (IJDAR), 19, 2, Pp. 155-172. Publisher's Version

Most United States Patent and Trademark Office (USPTO) patent documents contain drawing pages which describe inventions graphically. By convention and by rule, these drawings contain figures and parts that are annotated with numbered labels but not with text. As a result, readers must scan the document to find the description of a given part label. To make progress toward automatic creation of ‘tool-tips’ and hyperlinks from part labels to their associated descriptions, the USPTO hosted a monthlong online competition in which participants developed algorithms to detect figures and diagram part labels. The challenge drew 232 teams of two, of which 70 teams (30 %) submitted solutions. An unusual feature was that each patent was represented by a 300-dpi page scan along with an HTML file containing patent text, allowing integration of text processing and graphics recognition in participant algorithms. The design and performance of the top-5 systems are presented along with a system developed after the competition, illustrating that the winning teams produced near state-of-the-art results under strict time and computation constraints. The first place system used the provided HTML text, obtaining a harmonic mean of recall and precision (F-measure) of 88.57 % for figure region detection, 78.81 % for figure regions with correctly recognized figure titles, and 70.98 % for part label detection and recognition. Data and source code for the top-5 systems are available through the online UCI Machine Learning Repository to support follow-on work by others in the document recognition community.
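
For readers unfamiliar with the F-measure mentioned in the abstract above, the following minimal sketch computes the harmonic mean of precision and recall. The function name `f_measure` and the input values are hypothetical and chosen only for illustration; they are not the competition's scoring code or results from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (often called F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs for illustration only; not taken from the paper.
score = f_measure(precision=0.9, recall=0.8)
print(f"F-measure: {score:.4f}")
```

Because it is a harmonic mean, the F-measure is pulled toward the lower of the two inputs, so a system cannot score well by maximizing precision or recall alone.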