AI-driven Materials Discovery

Data-driven materials science, often called the “4th paradigm” of science, has opened an avenue towards materials discovery through statistically driven machine learning approaches. Owing to the continued increase in computing power and improvements in theoretical methods, predicted material properties have reached a reliability comparable to that of experiments, while the predictions greatly surpass experiments in speed and cost. This has given rise to a rapid increase in available open-source materials databases, facilitating materials discovery at an unprecedented scale. Progress on many renewable energy challenges is limited by the need to discover a material with tuned, exceptional properties. The 4th paradigm of science provides a critical toolset for tackling these new materials challenges.
The distinction between the classical Edisonian approach and the newly accessible in-silico approach is illustrated below. In the former approach, potential material candidates are tested one by one, based on chemical intuition or similarity to previously successful candidates. As experimental confirmation and/or high-fidelity ab-initio simulations of individual candidates are time- and cost-intensive, this approach restricts progress to longer time scales. In the latter approach, however, previous knowledge is used systematically to generate a statistically driven model that predicts the properties of new material candidates. The machine learning model facilitates the down-selection of promising material candidates at a fraction of the time and cost of conventional experimental or computational techniques.

Illustration of the Edisonian approach versus the in-silico approach to materials discovery
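The down-selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in this work: a k-nearest-neighbor regressor stands in for the statistically driven model, and the two descriptors per candidate, the synthetic target values, and the names `knn_predict`, `down_select`, and `cand_A` to `cand_C` are all hypothetical.

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Predict a property as the mean over the k nearest known materials."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def down_select(train_X, train_y, candidates, n_keep, k=3):
    """Rank candidates by predicted property (ascending) and keep the lowest n_keep."""
    scored = [(knn_predict(train_X, train_y, feats, k), name)
              for name, feats in candidates.items()]
    scored.sort()
    return [(name, pred) for pred, name in scored[:n_keep]]

# Hypothetical toy data: two made-up descriptors per material and a
# synthetic target property (not real measurements or calculations).
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
train_y = [2.0 + 3.0 * a + 1.0 * b for a, b in train_X]

# Untested candidates described by the same two descriptors.
candidates = {"cand_A": (0.1, 0.1), "cand_B": (0.9, 0.9), "cand_C": (0.4, 0.6)}

# Only the shortlisted candidates would go on to costly simulation or experiment.
shortlist = down_select(train_X, train_y, candidates, n_keep=2)
for name, predicted in shortlist:
    print(f"{name}: predicted property {predicted:.2f}")
```

The point of the sketch is the division of labor: the inexpensive model scores every candidate, and only the shortlist is passed on to time- and cost-intensive evaluation.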

In my most recent research, I have been collaborating with experts in machine learning (Prof. Evan Reed, Stanford University, and Prof. Christoph Dellago, University of Vienna) to screen a wide range of materials for potential low work function candidates using high-performance computing. The work focuses on using machine learning to develop a statistically driven surrogate model that predicts the work function of material surfaces. The workflow is illustrated below:

Illustration of the workflow for the work function surrogate model

Using this approach, I have established an extensive database of work functions for over 30,000 material surfaces from high-throughput ab-initio calculations. The distribution of work functions for all surfaces is plotted in the figure below. This large database enabled us to train a machine learning model whose test error is comparable to the accuracy of state-of-the-art density functional theory (DFT) work function calculations, while being several orders of magnitude faster. This model paves the way towards the efficient screening of a wide range of possible materials for new low work function candidates.

Distribution of work functions for all surfaces in the database
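A test error of the kind mentioned above is typically estimated on data held out from training. The following is a minimal sketch of that idea, assuming a single hypothetical descriptor, synthetic target values, and a straight-line fit as a stand-in for the actual machine learning model; the function names and all numbers are illustrative only.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def held_out_mae(xs, ys, test_fraction=0.2, seed=0):
    """Fit on a random training split and report the error on the held-out rest."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    preds = [a * xs[i] + b for i in test]
    return mean_absolute_error([ys[i] for i in test], preds)

# Synthetic stand-in data: a linear trend plus a small alternating deviation.
xs = [i / 10 for i in range(50)]
ys = [1.5 * x + 3.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]

mae = held_out_mae(xs, ys)
print(f"held-out mean absolute error: {mae:.3f}")
```

Because the held-out surfaces never enter the fit, this error indicates how the model is likely to perform on new candidates, and it is the quantity one would compare against the accuracy of the reference calculations.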

Selected Publications

  • P. Schindler, E. R. Antoniuk, G. Cheon, Y. Zhu, and E. J. Reed, Discovery of materials with extreme work functions by high-throughput density functional theory and machine learning, arXiv:2011.10905 (2020)

Collaborators / Advisors