ACES Student Helps Research Methods for Identifying Program Authorship


Jeffrey Micher, Aylin Caliskan-Islam, Dr. Richard Harang, Andrew Liu, Dr. Clare Voss

Open campus initiative brings natural language processing to cyber research

This summer, second-year ACES student Andrew Liu interned at the Army Research Laboratory at Adelphi through the College Qualified Leaders program. Working with collaborators, Liu sought to improve methods for identifying the authorship of unknown programs. The team used stylometry, which examines the style of a program's source code, and abstract syntax trees, which represent the syntactic structure of a program, to identify the authors of anonymous programs.
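To give a rough sense of what style and syntax-tree features can look like, here is a minimal sketch in Python. It is not the project's method. The function names, the chosen features, and the two sample snippets are all hypothetical, and it relies only on Python's built-in `ast` module to count syntax-tree node types.

```python
import ast
from collections import Counter


def ast_node_profile(source: str) -> Counter:
    """Count how often each AST node type appears in Python source code."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def layout_features(source: str) -> dict:
    """Compute a few surface-level style features (illustrative choices only)."""
    lines = source.splitlines()
    non_empty = [line for line in lines if line.strip()]
    return {
        "avg_line_length": sum(len(line) for line in non_empty) / max(len(non_empty), 1),
        "blank_line_ratio": (len(lines) - len(non_empty)) / max(len(lines), 1),
        "tab_indented_lines": sum(1 for line in non_empty if line.startswith("\t")),
    }


# Two hypothetical authors solving the same task in different styles.
sample_a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
sample_b = "def total(xs):\n    return sum(x for x in xs)\n"

profile_a = ast_node_profile(sample_a)
profile_b = ast_node_profile(sample_b)

# The explicit loop shows up as a For node; the one-liner shows up as a generator expression.
print(profile_a["For"], profile_b["For"])
print(profile_a["GeneratorExp"], profile_b["GeneratorExp"])
print(layout_features(sample_a))
```

Both snippets compute the same sum, yet their node counts differ. An attribution approach would typically feed many such features into a classifier trained on samples from known authors.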

Do authors of source code have a unique identifying style—a "coding fingerprint"—that can be learned from a few samples of their work and used to identify them?

This question holds implications for protecting intellectual property as well as for identifying malware authors and tracking the evolution of malware. It spurred a cross-cutting summer project that exercised Open Campus to bring disciplines, institutions, and experts together.

Aware that methods from text analytics can strengthen cyber analytics, U.S. Army Research Laboratory researchers Dr. Clare Voss, a computational linguist, and Dr. Richard Harang, a network-science researcher, met to define a seed problem with Drexel University Professor Rachel Greenstadt, a leader in electronic-privacy and information-security research who has pioneered methods for authorship attribution in both prose and source code.

You can read more about Andrew's project at the United States Army Research Laboratory.

Published October 6, 2014