07.31.17 | Computer Science

Monday, July 31, 2017

CSE Grad Student Helps Google Estimate $25 Million in Ransomware Payouts

$25,253,505. That is the best estimate to date of how much money was paid by victims of ransomware attacks in the past two years in order to unlock their computer disks and get their data back. As a result, ransomware - malware that encrypts victims' data and demands a payoff in exchange for the key to unlock the data - "has become one of the largest cybercrime revenue sources," according to Google presenters at the Black Hat USA 2017 conference in Las Vegas this week.

In 2016, ransomware became a multi-million-dollar business.

Participants in the study on "Tracking Ransomware End to End" included researchers from UC San Diego, New York University (NYU), and the blockchain analysis firm Chainalysis. (Blockchain is the public, decentralized ledger of transactions in Bitcoin, the cryptocurrency most widely used to settle ransomware demands.)

Rather than produce an academic paper first, the team opted to make a splash at the conference with a presentation to get the word out. The presenter: Google's Kylie McRoberts. Now in its 20th year, Black Hat is the world's leading information security event series.

Ph.D. candidate Danny Y. Huang

The UC San Diego participant in the study, Computer Science and Engineering (CSE) Ph.D. candidate Danny Yuxing Huang, is also affiliated with the Center for Networked Systems (CNS). "We study the economics of operating ransomware: from maintaining infrastructure, generating revenue, to getting victims to pay," noted Huang, adding that "our goals are to understand the business model of ransomware, and estimate their revenue and potential profitability."

Huang tracked bitcoins that moved from potential victims to ransomware, and from ransomware to exchanges (as possible liquidation). "By masquerading as a part of the ransomware infrastructure," explained Huang, "I also gathered statistics on infected computers, such as the number of infections over time, and the geographical distribution of infected machines."

Former postdoc Damon McCoy, now an NYU professor.

Google's other university collaborator was Damon McCoy, a former postdoctoral scientist in CSE at UC San Diego from 2009 to 2011, who is now an assistant professor of computer science in NYU's Tandon School of Engineering.

Together, the researchers investigated 300,000 files from 34 different types of ransomware and tracked payments on the blockchain to analyze the scale and the amount of money paid by victims.

In the presentation, Google's McRoberts reported that search queries for the term "ransomware" have increased by 877 percent since 2016, the first year when ransomware became a multi-million-dollar business (see chart).

Ransomware notice

Of the $25 million in payments by Internet users to get their data back, some ransomware attacks generated more revenue than others. Only a fraction of the total was paid by victims of the widely publicized WannaCry ransomware in 2017, despite - or because of - the extensive damage it caused. Spread using an exploit originally developed by the U.S. National Security Agency (NSA), WannaCry crippled hospitals (including Britain's National Health Service), communications providers and some 10,000 other organizations as well as an estimated 200,000 individuals in more than 150 countries. Even so, payouts in response to WannaCry topped out at $140,000 - making it only the 11th-largest ransomware to date in terms of victim payouts. The Google presenter dubbed WannaCry an "impostor," saying it should really be classified as "wipeware." The study found that victims learned early on that the malware effectively wiped out the data because the software was unable to later unlock the victim's computer even if the ransom was paid. The study noted that NotPetya, a later outbreak that spread using the same exploit, was also wipeware, for the same reason, and concluded that "wipeware pretending to be ransomware is on the rise."

Less publicized ransom demands launched in 2016, on the other hand, generated far more income for the attackers than WannaCry, notably the Locky ($7.8 million to date) and Cerber ($6.9 million) ransomware attacks.

Locky, the first ransomware to make over $1 million per month.

Locky was the first ransomware to make over $1 million per month. It has largely run its course, but left its mark on the criminal marketplace because it brought "ransoms to the masses," according to the presentation at Black Hat USA. "Locky's big advantage was the decoupling of the people who maintain the ransomware from the people who are infecting machines," said NYU professor McCoy. "Locky just focused on building the malware and support infrastructure. Then they had other botnets spread and distribute the malware, which were much better at that end of the business."

The same botnet that distributed Locky now also distributes Cerber and other ransomware built on Locky's model. Cerber continues to rake in roughly $200,000 a month in ransom, as it has for more than a year, buoyed by its creation of an affiliate model that is "taking the world by storm," noted Google.

According to the study, victims of all ransomware paid ransom by purchasing Bitcoins on at least 10 exchanges. The single largest market, LocalBitcoins.com, had 37% of the market in the two-year period.

The $25 million number in the new study reflects total ransomware payouts by victims. It is unclear, however, how much of the money made it back to the original authors of that ransomware.

UC San Diego contributor Danny Huang is nearing completion of his Ph.D. under advisors Alex Snoeren and research scientist Kirill Levchenko. He is scheduled to mount the final defense of his dissertation at the end of August. Huang and his colleagues are working on an academic paper they hope to publish later this year on their ransomware findings.

UC San Diego Economics of Ransomware Website
Presentation: Tracking Desktop Ransomware Payments (PDF)

CSE Professor Honored with Cognitive Science Society Fellowship

Computer Science and Engineering (CSE) professor Gary Cottrell calls himself "a cognitive scientist collecting a computer science salary", noting that this is "much better than the other way around." Now, he has been elected a Fellow of the Cognitive Science Society, an honor reflecting his "impact on the Cognitive Science community and... sustained record of excellence in research contributions."

CSE professor Gary Cottrell

After earning his Ph.D. in Computer Science from the University of Rochester in 1985, Dr. Cottrell came to San Diego to do a post-doc at the Center for Human Information Processing with the mathematical psychologist David E. Rumelhart, one of the discoverers of the extremely popular backpropagation learning algorithm for neural networks. Following his post-doc, he was hired by the Computer Science and Engineering department, and has been teaching computer science at UC San Diego ever since.

The CSE professor is now being honored for his work in Cognitive Science, including the 12 years he has been Director of the UC San Diego-based Temporal Dynamics of Learning Center (TDLC), a National Science Foundation funded Science of Learning Center that he heads with Andrea Chiba of the Cognitive Science Department. At UC San Diego, Cottrell also directs the Interdisciplinary Ph.D. Program in Cognitive Science.

At the upcoming Cognitive Science Society annual meeting in London, Fellows in the Class of 2017 will be treated to a free dinner. Cottrell laments being elected this year, because his free dinner will be "British food." However, always ready to see a silver lining, he notes that if he had been elected last year, when the conference was in Philadelphia, it would have probably been Philly cheesesteak (Cottrell is a vegetarian).

The annual meeting will run July 26th-29th, and this year's theme is "Computational Foundations of Cognition." Cottrell has been a member of the meeting's program committee for decades, and was the program chair when it was held at UC San Diego in 1996.

Cottrell's papers in this year's conference include one entitled "Learning to See People Like People: Predicting Social Perceptions of Faces." Cottrell's co-authors include first author Amanda Song, a Ph.D. student in Cognitive Science (co-advised by Cottrell); ECE alumna Linjie Li (M.S. '16), who is now doing her Ph.D. at Purdue University; and CSE junior Chad Atalla, a machine learning undergraduate researcher in the Cottrell Lab, where he studies facial attractiveness predictors.

Image of four of the features used for attractiveness. Faces maximally activated one of four 'neurons' in the model, along with 'deconvolution' of the feature back to the input to see what part of image caused the activation. Note that not all faces are attractive: you need many more units than four to be active to be considered 'attractive' by the model.

Humans make complex inferences on faces, ranging from objective properties (gender, ethnicity, expression, age, identity, etc.) to subjective judgments (facial attractiveness, trustworthiness, sociability, friendliness, etc.). While the objective aspects of face perception have been extensively studied, relatively few computational models have been developed for the social impressions of faces. Bridging this gap, Cottrell's team developed a method to predict human impressions of faces in 40 subjective social dimensions, using deep representations from state-of-the-art neural networks. Cottrell notes that these subjective impressions do not necessarily reflect objective truth, but could be useful in social robots, which will need to understand how people view each other. This work could also be used to select your best Facebook profile image. The paper is available online from Cottrell's publications page.

A second paper concerns the brain's hemispheric asymmetries. Your left hemisphere's visual system tends to process fine detail (AKA "high spatial frequencies"), while the right hemisphere responds best to low-resolution features (AKA "low spatial frequencies"). Due to the strange way your visual system is laid out, everything to the left of where you focus is directed to the right hemisphere and vice-versa, allowing visual psychophysicists to measure hemispheric differences by presenting stimuli to the left or right visual fields. Cottrell's model (unlike previous ones) does not build in the fine detail/broad strokes difference between the hemispheres; rather, it falls out of building networks based on hypothesized connectivity differences between cortical patches within each hemisphere. In this paper, entitled "Categorical vs Coordinate Relationships Do Not Reduce to Spatial Frequency Differences," Cottrell and colleagues employ his model to explain some classic data used by Stephen Kosslyn of Harvard to argue that the right hemisphere is better at metric tasks, while the left is better at categorical tasks, including data that does not fit Kosslyn's theory. The paper's other authors are first author Vishaal Prasad, a CSE M.S. student (B.S. '16, M.S. '17), and Ben Cipollini, a Cognitive Science Ph.D. alumnus and former postdoc (now at Classy.org). This paper is also available online.

In addition to his credentials in both computer science and cognitive science, Cottrell earned his undergraduate degree from Cornell University with dual majors in two completely different disciplines: Mathematics and Sociology - reflecting his aspiration to be the first Hari Seldon, the psychohistorian hero of Isaac Asimov's Foundation Trilogy.

The Cottrell Lab
TDLC

NSF Approves Proposal for Machine Learning Cyberinfrastructure

Effective October 1, 2017, CSE Professor and Calit2 Director Larry Smarr will become Principal Investigator on a new NSF-funded, $1 million community infrastructure in support of machine learning research. Ahead of the launch, HPCwire interviewed Smarr for a preview of the proposed cyberinfrastructure, which includes two co-PIs affiliated with CSE: Professor Tajana Rosing, and CSE lecturer Ilkay Altintas, who is also SDSC's Chief Data Science Officer. A much longer list of potential users of the community infrastructure includes CSE professors Ravi Ramamoorthi, Manmohan Chandraker, Arun Kumar, Rajesh Gupta, Gary Cottrell, and others. Following is an excerpt from HPCwire's July 25 article:

CSE professor Larry Smarr, PI on NSF-funded CHASE-CI.

The ambitious plan - Cognitive Hardware and Software Ecosystem, Community Infrastructure (CHASE-CI) - is intended to leverage the high-speed Pacific Research Platform (PRP) and put fast GPU appliances into the hands of researchers to tackle machine learning hardware, software, and architecture issues.

Given the abrupt rise of machine learning and its distinct needs versus traditional FLOPS-dominated HPC [High Performance Computing], the CHASE-CI effort seems a natural next step in learning how to harness PRP's high bandwidth for use with big data projects and machine learning. Perhaps not coincidentally [CSE's] Smarr is also principal investigator for PRP. As described in the NSF abstract, CHASE-CI "will build a cloud of hundreds of affordable Graphics Processing Units (GPUs), networked together with a variety of neural network machines to facilitate development of next generation cognitive computing."

Those are big goals. Last week, Smarr and co-PI Thomas DeFanti spoke with HPCwire about the CHASE-CI project. It has many facets. Hardware, including von Neumann (vN) and non von Neumann (NvN) architectures, software frameworks (e.g., Caffe and TensorFlow), six specific algorithm families (details near the end of the article), and cost containment are all key target areas. In building out PRP, the effort leveraged existing optical networks such as GLIF by building termination devices based on PCs and providing them to research scientists. The new device - dubbed FIONA (Flexible I/O Network Appliances) - was developed by PRP co-PI Philip Papadopoulos and is critical to the new CHASE-CI effort. A little background on PRP may be helpful.

Data showing increase in PRP performance over 15-month time period ending in April 2017.

As explained by Smarr, the basic PRP idea was to experiment with a cyberinfrastructure that was appropriate for a broad set of applications using big data that aren't appropriate for the commodity internet because of the size of the datasets. To handle the high-speed bandwidth, you need a big bucket at the end of the fiber, notes Smarr. FIONAs filled the bill; the devices are stuffed with high performance, high capacity SSDs and high speed NICs but based on the humble and less expensive PC.

"They could take the high data rate without TCP backing up and thereby lowering the overall bandwidth, which traditionally has been a problem if you try to go directly to spinning disk," says Smarr. Currently, there are on the order of 40 or 50 of these FIONAs deployed across the West Coast. Although 100 gigabit throughput is possible via the fiber, most researchers are getting 10 gigabit, still a big improvement.

DOE tests the PRP performance regularly using a visualization tool MadDash (Monitoring and Debugging Dashboard). "There are test transfers of 10 gigabytes of data, four times a day, among 25 organizations, so that's roughly about 300 transfers four times a day. The reason why we picked that number, 10 gigabytes, was because that's the amount of data you need to get TCP up to full speed," says Smarr.

Networks are currently testing out at 5, 6, 7, 8 and 9 gigabits per second, which is nearly full utilization. "Some of them really nail it at 9.9 gigabits per second. If you go to 40 gigabit networks that we have, we are getting 13 and 14 gigabits per second and that's because of the [constrained] software we are using. If we go to a different software, which is not what scientists routinely use [except] the high energy physics people, then we can get 30 or 40 or 100 gigabits per second - that's where we max out with the PC architecture and the disk drives on those high end units," explains DeFanti.

[Editor's note: To read the full original article, click here to view the complete report on HPCwire.]
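The test-mesh figures Smarr quotes in the excerpt can be checked with a few lines of arithmetic. The sketch below assumes each unordered pair of the 25 organizations is tested once per round, which reproduces his "roughly about 300 transfers"; the actual mesh configuration may differ (for example, by testing both directions). The function names are hypothetical, and 1 gigabyte is treated as 8 gigabits.

```python
from math import comb


def mesh_transfers(sites: int, rounds_per_day: int) -> tuple[int, int]:
    """Return (transfers per round, transfers per day) for a full mesh.

    Assumption: every unordered pair of sites is tested once per round.
    """
    pairs = comb(sites, 2)
    return pairs, pairs * rounds_per_day


def transfer_seconds(gigabytes: float, gigabits_per_second: float) -> float:
    """Ideal time to move a payload at a sustained line rate (no overhead)."""
    return gigabytes * 8 / gigabits_per_second


pairs, daily = mesh_transfers(25, 4)
print(pairs, daily)                          # 300 1200
print(round(transfer_seconds(10, 9.9), 2))   # 8.08 seconds at 9.9 Gb/s
```

Under these assumptions a 10-gigabyte test at close to full 10-gigabit utilization lasts about eight seconds, long enough for TCP to ramp up, which is consistent with the reason Smarr gives for choosing that size.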

CSE Ranks #14 among U.S. Programs in Computer Science and Engineering

The Computer Science and Engineering program at UC San Diego now ranks #14 in the nation, and #40 worldwide, according to an expanded subject-area ranking in the 2017 Academic Ranking of World Universities (ARWU), published by the ShanghaiRanking Consultancy. Based on five hard-data metrics, the results put the Jacobs School of Engineering among the top 20 U.S. institutions in eight subject areas, including mechanical engineering, civil engineering, bioengineering, automation and control, biomedical engineering, materials science and engineering, and electrical and electronic engineering in addition to computer science and engineering.

2017 computer science rankings from the ARWU.

The growing strength in robotics -- which includes key faculty in the CSE department, including Henrik Christensen, Laurel Riek and Ndapa Nakashole -- contributed to the school's #10 ranking in automation and control, which includes robotics. The school is a longtime leader in controls engineering, and it is home to the Cymer Center for Control Systems and Dynamics as well as the Contextual Robotics Institute (the latter launched in 2015). The robotics institute's fourth annual international robotics forum will be held on October 27, 2017. This year's theme: Autonomous Driving 2025.

The subject area rankings are based on a weighted combination of five indicators: the number of papers in top journals for that subject area; the average impact of papers from an institution in a particular subject area (on this count, CSE did better than top U.S. programs except Stanford); the percentage of papers that represent international collaborations; the total number of papers authored by an institution in a subject area; and the number of staff winning significant awards in that subject area. On this last indicator, UC San Diego received zero points in computer science and engineering because no CSE faculty member has received what's considered to be the top honor in computer science: the Turing Award.
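To make the idea of a "weighted combination" concrete, here is a minimal sketch of how five indicator scores could be folded into one number. The equal weights, the 0-100 scale, the indicator names and the sample scores are all hypothetical placeholders chosen for illustration; they are not ARWU's published weights or UC San Diego's actual scores.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted average of indicator scores (each assumed 0-100)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical equal weights for the five indicators described above.
WEIGHTS = {
    "top_journal_papers": 1,
    "citation_impact": 1,
    "international_collaboration": 1,
    "total_papers": 1,
    "awards": 1,
}

# Hypothetical scores for a program that earns zero points on awards.
program = {
    "top_journal_papers": 80,
    "citation_impact": 90,
    "international_collaboration": 70,
    "total_papers": 85,
    "awards": 0,
}

print(weighted_score(program, WEIGHTS))                     # 65.0
print(weighted_score(dict(program, awards=50), WEIGHTS))    # 75.0
```

With these made-up numbers, a zero on a single indicator pulls the combined score down by a full share of that indicator's weight, which illustrates why the awards indicator can matter so much to the final rank.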

See all the rankings on the ARWU Subject Area Rankings website.

Fall 2017 Distinguished Lecture Series Begins to Take Shape

Looking ahead to the 2017-2018 academic year, CSE is lining up speakers for the fall Distinguished Lecture Series. With full details on the topics and abstracts still to come, the committee has already scheduled five lectures, beginning in early October. The speakers already on the calendar include:

Muddu Sudhakar

Silicon Valley serial entrepreneur Muddu Sudhakar will deliver the inaugural lecture in the series on Monday, October 2 at 11am in CSE room 1202. Sudhakar is currently VP and general manager of security and IoT at Splunk. Previously he was CEO of Caspida, a leader in next-generation cyber security and threat detection. Sudhakar is a seasoned and successful entrepreneur in Silicon Valley, and was a VP at VMware and Pivotal from 2012 to 2014 for big data analytics and cloud services. He joined VMware in 2012 after the company acquired Cetas, a company he co-founded and led as CEO. Earlier, at EMC, Sudhakar was Chief Strategy Advisor, VP and general manager for cloud information services, and from 2003 to 2010, he was CEO and founder of Kazeon, which was acquired by EMC. He also co-founded Sanera Systems, a next-generation SAN technology company, in 2009; it was acquired four years later by McData (Brocade). He started his career at Silicon Graphics as designer of CPU and server technology, following completion of his Ph.D. in Computer Science from UCLA. Sudhakar is widely published in industry journals and conference proceedings and has more than 25 patents in Cyber Security, Big data analytics, Machine learning, Analytics, Data science, Cloud Services, Enterprise Search, Information Management, Distributed systems, Storage/Server technologies, Virtualization, Information security, Networking technology and VLSI chip design.

Nikos Kyrpides

DOE Joint Genome Institute's Microbiome Data Science head, Nikos Kyrpides, will deliver his distinguished lecture on October 16, 2017 at 11am in CSE 1202, where he'll be hosted by CSE Prof. Pavel Pevzner. Dr. Kyrpides joined the DOE Joint Genome Institute in 2004 to lead the Genome Biology Program and the development of the data management and comparative analysis platforms for microbial genomes and metagenomes (IMG). He became the Metagenomics head in 2010 and has led the Prokaryotic Super Program and the Microbiome Data Science Group since 2011. Prior to joining the DOE Joint Genome Institute, Dr. Kyrpides led the development of the genome analysis and Bioinformatics core at Integrated Genomics Inc. in Chicago, IL. He did his postdoctoral studies with Carl Woese at the University of Illinois at Urbana-Champaign and with Ross Overbeek at the Argonne National Laboratory. Kyrpides has earned honors including the van Niel International Prize for Studies in Bacterial Systematics (2011-2014); American Academy of Microbiology Fellow (2014); Thomson Reuters Highly Cited Researchers (2014-2016); Academic Excellent Prize Award, Empeirikeion Foundation (2012); and Outstanding Performance Award, Lawrence Berkeley National Laboratory (2007). Professor Kyrpides earned his Ph.D. in Molecular Biology and Biotechnology from the University of Crete in Greece.

Majid Sarrafzadeh

UCLA Professor Majid Sarrafzadeh will deliver his lecture on Monday, October 23 at 11am in CSE room 1202. Sarrafzadeh is a distinguished professor of Computer Science at UCLA, working primarily in mobile health and data analytics. His research interests include healthcare technology, computer system architecture, VLSI-CAD, algorithms and biomedical informatics. Sarrafzadeh was elected an IEEE Fellow in 1996. Sarrafzadeh received his Ph.D. in 1987 from the University of Illinois at Urbana-Champaign in Electrical and Computer Engineering. He joined Northwestern University as an Assistant Professor in 1987. In 2000, he joined the Computer Science department at UCLA. He is a co-founder and co-director of the Center for SMART Health. Sarrafzadeh has published more than 530 papers, co-authored five books, and is a named inventor on many U.S. patents. He has collaborated with many industries and co-founded several companies, among them MediSens Wireless, Wanda Health, Bruin Biometric and Hierarchical Design (acquired by Xilinx in 2004). Current research projects involve mobile health technologies to improve clinical management of pediatric conditions, to reduce hospital readmissions for 1,500 heart-failure patients, and to apply novel healthcare monitoring for a large population of elderly patients.

Manuela Veloso

Carnegie Mellon University Professor Manuela Veloso is scheduled to give her lecture on Monday, November 6 at 11am in CSE room 1202. Veloso is the Herbert A. Simon University Professor in the School of Computer Science and Head of the Machine Learning Department at CMU, where she has been a faculty member since earning her Ph.D. in computer science there in 1992. Veloso delivered a keynote speech at the 2015 Grace Hopper Conference on Women in Computing. The computer scientist is renowned for her work in artificial intelligence and robotics, and with her students, Veloso studies a variety of autonomous robots, including mobile service robots and soccer robots. She is the past president of the Association for the Advancement of Artificial Intelligence (AAAI), as well as co-founder and past president of the international RoboCup Federation. Veloso was named a University Professor, the highest academic accolade bestowed by CMU, in 2014, and was honored as an Einstein Chair Professor by the Chinese Academy of Sciences in 2012. She is a Fellow of AAAI, IEEE and AAAS. She is a recipient of an award from the Association for Computing Machinery's Special Interest Group on Artificial Intelligence, as well as a National Science Foundation Career Award and the university's Allen Newell Medal for Excellence in Research.

Ari Juels

Cornell Tech Professor Ari Juels will speak on Monday, November 13 at 11am in CSE room 1202. He joined New York City-based Cornell Tech in 2014. Previously, he was Chief Scientist of RSA (The Security Division of EMC), Director of RSA Laboratories, and a Distinguished Engineer at EMC, where he worked until 2013. Juels received his Ph.D. in computer science from U.C. Berkeley in 1996. His recent areas of interest include cryptocurrency and smart contracts, applied cryptography, cloud security, user authentication, and privacy, among other topics. He is co-director of the Initiative for CryptoCurrencies and Contracts (IC3). In 2004, MIT's Technology Review magazine named Professor Juels one of the world's top 100 technology innovators under the age of 35. Computerworld honored him in its "40 Under 40" list of young industry leaders in 2007. Juels' research interests broadly span security, privacy, and cryptography. His group is exploring blockchains, cryptocurrency (such as Bitcoin) and smart contracts; cloud security; honey objects and ways to defend from fake resources that deceive adversaries; and post-privacy systems (i.e., how to ensure fair use of data and fair decision-making).

Note: Most of the weekly DLS spots are already booked for the Fall quarter, but faculty who would like to host a Distinguished Lecture should contact DLS committee chair Pavel Pevzner to suggest a speaker from their area.

Comic-Con Redux: CSE Professor Participated in AI Session at 2017 Event

As we reported in the last CSE Newsletter, CSE assistant professor Ndapa Nakashole participated in a panel discussion on the future of artificial intelligence. Also on the panel were Hollywood writers including Craig Tilley, who writes for and produces the TV show, Marvel's Agents of S.H.I.E.L.D. The panel drew a large crowd to the discussion about the science and science fiction of AI and where we go from here.

CSE assistant professor Ndapa Nakashole (left) and Hollywood writer Craig Tilley on Comic-Con panel

UPCOMING EVENTS

WEDNESDAY, AUGUST 2, 2017

Data-Driven Techniques for Type Error Diagnosis

Ph.D. candidate Eric Seidel is set to defend his dissertation this week in front of a committee consisting of his advisor Ranjit Jhala, fellow CSE professors Bill Griswold and Sorin Lerner, as well as Cognitive Science professors Philip Guo and James Hollan. Seidel's dissertation is on "Data-Driven Techniques for Type Error Diagnosis."

Eric Seidel

Date: Wednesday, August 2
Time: 12 noon
Location: Room 3217, CSE Building

Abstract: Static type systems are a powerful tool for reasoning about the safety of programs. Global type inference eliminates one of the prime complaints against static types, that the annotation burden is too high. However, this introduces its own problems as the type checker must now make assumptions about what the programmer intended to do. A single incorrect assumption can lead the type checker to erroneously blame an expression far from the actual error the programmer made, which can be particularly confusing for newcomers who have not yet constructed a mental model for how the type checker works. In this dissertation we present a pair of complementary techniques to localize and explain type errors, with an emphasis on the errors encountered by novice users.

We tackle the localization problem by using machine learning to learn a model of the errors made by students in an introductory course. Then, we use the model to produce a ranked list of likely error locations in new programs. Our models can be trained on a modest amount of data, e.g. a single instance of a course, and we envision a future where each introductory course is accompanied by a model of its students' errors. To better explain the error to novice users, we present a runtime error that the type system would have prevented. We interleave type-checking and execution to search for a set of program inputs that would lead execution to a bad state, and present the execution trace to the user in an interactive debugger. This allows the user to explore why their program was rejected, and connects the dynamic (runtime) semantics to the static (typing) semantics.

We have evaluated our techniques empirically using a new dataset of ill-typed student programs collected from two instances of an undergraduate programming languages course at UC San Diego. We have also performed user studies with novice users, comparing the output of our techniques with the state of the art in type error diagnosis. Our results show that these are practical, lightweight techniques for improving the error messages produced by type checkers.

Giving Opportunities

The Jacobs School of Engineering offers a variety of ways to support the Department of Computer Science and Engineering. As the 2017-'18 academic year gets underway, please consider giving online to the CSE Engineering Tutor Program or the Paul R. Kube Chair of Computer Science. You can also honor your favorite teacher when you donate to the CSE Teaching Endowment Fund.