Bhavya Daya





Completed Ph.D. in Electrical Engineering at MIT.

Contributed to the development of innovative solutions to problems in computing, communications and electronics.

Developed research, teaching and project management expertise required for industry, academic or entrepreneurial opportunities.




Doctor of Philosophy (Ph.D. EECS)

Department of Electrical Engineering & Computer Science.


Completed Research Qualification, Technical Qualification, and Minor Requirements.

Completed Teaching Certificate Program.

Completed Doctoral Thesis & Defense.




Bachelor of Science in Computer Engineering (Summa cum Laude);

Bachelor of Science in Electrical Engineering (Summa cum Laude);

Master of Engineering in Computer & Electrical Engineering (High Honors).


Mastered hardware and software concepts related to computer and electrical engineering through completion of the dual degree in Computer and Electrical Engineering.

Increased understanding of high-performance computer architecture, high-performance computer communications, and VLSI design and technology through enrollment in the combined degree program.

Participated in IPPD and REU for industrial project and research experience, respectively.



High School Honors Diploma   



SC2EPTON: High-Performance and Scalable, Low-Power and Intelligent, Ordered Mesh On-Chip Network

  • Over the last few decades, obstacles to performance and voltage scaling have led to a shift from uniprocessors to multicore processors, to the point where the on-chip interconnect plays a larger role in achieving the desired performance and power goals. Shared-memory multicores are subject to data integrity concerns because each processor computes on data locally and may be unaware of external modifications. Hardware cache coherence addresses this problem and provides superior performance to software-implemented coherence, but it is limited by practical constraints such as area, power, and timing. Scaling coherence to a large number of cores presents three challenges: unscalable storage, high power consumption, and increased on-chip network traffic. The SC2EPTON architecture was invented to address these three challenges.
  • The SC2EPTON architecture is a low-power bufferless architecture capable of high-performance communication for snoopy coherence over an ordered mesh network. Key ideas: (1) Message delivery is decoupled from ordering, allowing messages to arrive in any order and at any time and still be correctly ordered. (2) An idle, preset, multihop path can be traversed by a flit that did not explicitly reserve it. (3) Each node locally and dynamically determines the time-division-multiplexed access for the ring, allows simultaneous ring access for non-contending sources, and sets router control signals.


  • Computer Organization; Digital Logic; Digital Design; Microprocessor Applications; Computer Architecture; Parallel Computer Architecture;

  • Electric Circuit; Microelectronic Circuit; Signals & Systems; Communications Systems & Components; Computer Data & Wireless Communications;

  • Advanced Programming Fundamentals; Applied Discrete Structures; Data Structures & Algorithms; Operating Systems; Software Engineering;

  • VLSI Design & Technology; RF Integrated Circuits & Technology;

  • Advanced Calculus; Differential Equations; Linear Algebra; Engineering Analysis; Engineering Statistics;

  • Elements of Machine Intelligence;

  • Technical Writing.




    Basic, C, C++, Java, MIPS, VHDL, Verilog, MATLAB, LISP.




   Digital Logic & Design:   PSpice, Altera Quartus, Xilinx ISE

   Embedded System Design:   Altera Nios II, Xilinx Platform Studio

   Computer Aided Design:   Pro/ENGINEER, Cadence

   Software Engineering:   Visual Studio

   Computer Networks:   Berkeley Socket API, Network Simulator

   Operating Systems:   DOS, Windows, Unix, Minix, µClinux

   Productivity:    MS Office, Adobe

   Engineering Statistics:   Minitab



Test Fixture Automation for Meniere’s Disease Treatment Device

Medtronic XOMED, Jacksonville, FL: 2007-2008

  • Participated in a multidisciplinary, practice-oriented program that takes an entrepreneurial approach to integrated product and process design (IPPD) for an industry-sponsored project.

  • Responsible for software development and software-hardware integration.


Rapid Prototyping of Embedded Systems Using FPGAs: 2007-2008

  • Reviewed developments and future trends in embedded system design. Explored board-level rapid prototyping of an embedded system using FPGAs.

  • Used Altera’s DE2 Educational & Development Board with a Cyclone II FPGA and Xilinx’s PowerPC & MicroBlaze Development Kit with a Virtex-4 FPGA.


Low-Power Configurable Logic Block Design for FPGAs

(Fall 2009)           

An island-style FPGA structure is investigated, and the configurable logic block (CLB) is explored in order to reduce its power consumption. Approximately 16 percent of FPGA power is consumed by the CLBs alone. As technology nodes scale down, leakage power in these logic blocks will increase. The effect of this leakage on total power should also not be ignored for FPGA cores that can be embedded within an ASIC architecture. This project explores methods for reducing the power consumption of each CLB relative to a standard CLB.

Microprocessor Thermal Analysis using the Finite Element Method

(Spring 2010)

On-chip temperature is a concern because hot spots can degrade reliability and performance. Thermal modeling of the chip allows the designer to view the hot spots and adjust the architecture to obtain a reliable chip. The trend towards multicore processors, driven by the performance demands of the consumer market, is making thermal effects increasingly important. Different methods of implementing thermal analysis were evaluated to determine their relative benefits. Finite element analysis was ultimately chosen and used to perform a case study on a microprocessor and to evaluate different floorplans for multicore processors.

Ultra Low Power Electrocardiogram Amplifier Design

(Fall 2011)

An effective EKG amplifier should not only amplify small signals but should also be immune to 60 Hz noise. The design used an instrumentation amplifier whose first stage was a self-biasing cascade non-inverting differential amplifier with a telescopic current mirror. The first stage combated flicker noise and allowed for high gain and CMRR. The second stage consisted of the difference and common-mode feedback amplifiers, designed as cascoded current mirror op-amps. The telescopic current mirror provided high gain, low bias current, and small output swing. The cascoded current mirror provided high gain, higher output swing, and twice the bias current. The total power of the amplifier was only 1.83 µW.

Other Projects

Development and Optimization of a Cycle-Accurate Pipelined Instruction Set Simulator for a Custom Instruction Set
Analysis of Combined Bimodal and GShare Branch Prediction Schemes
Synchronous 16 X 8 SRAM Design
All Digital Phase Locked Loop Design and Implementation
Super-heterodyne FM Receiver Design and Simulation
Network Security: History, Importance and Future
Reliable Broadcast in Error-prone Multi-hop Wireless Networks: Algorithms and Evaluation



NSF Center for High-Performance Reconfigurable Computing. 

University of Florida,  Gainesville, FL


  • Digital design and implementation of a virtual architecture for partial reconfiguration using a Xilinx Virtex-4 FPGA and the Embedded Development Kit.

  • VHDL modeling of architectural sections for simple sequential and combinational logic components of the intercommunication architecture.


OMNI: Ordered Mesh Network Interconnection for Multicore Processors



Research in the area of on-chip networks is reaching a critical point. New technologies and protocols need to be developed for continued growth. As the number of cores within a chip multiprocessor (CMP) increases, a bus will run into scalability limits. The research, design, and development of an ordered mesh network interconnect is significant because it satisfies the latency, bandwidth, and power requirements. The network itself is used to maintain sequential consistency, simplifying the cores and making the interface smarter.


IAP 2011

6.099 Microelectronic Devices & Circuits


The review course emphasizes microelectronic device modeling and basic microelectronic circuit analysis & design.


It covers Semiconductor Basics, 5 Basic Equations, Carrier Injection, Flow Problems, Junction Diodes, Bipolar Junction Transistors, MOS Capacitors, MOSFETs, Linear Equivalent Circuits, Inverter Basics, CMOS, Linear Amplifier Basics, Differential Amplifiers and Analysis, Multistage Amplifiers, and High Frequency Analysis of Linear Amplifiers.


IAP 2012 (Scheduled)

6.190 Rapid Prototyping of Embedded Systems Using FPGA



This laboratory course, to be offered for the first time during IAP 2012, would have exposed senior undergraduate and graduate students to rapid prototyping of embedded systems using an FPGA-based development and education board. It would have integrated their knowledge of digital logic, programming, and systems design.



WTP - Electrical Engineering Instructor


The MIT Women's Technology Program (WTP) in EECS is a four-week summer academic and residential experience in which female high school students, in the summer after 11th grade, explore engineering and computer science through hands-on classes, labs, and team-based projects.


SUMMER  2011

INTEL : Many-Integrated Core Division of VPG

Hillsboro, OR


The division works on the development of high-performance multicore processors with graphics capabilities. The project emphasis was on the memory execution cluster within the processor core. Because of the strict timing requirements and area constraints, careful design decisions were made before the implementation could proceed. A performance model for cache and TLB locking was analyzed and implemented. The unit that communicates between the memory cluster and the graphics unit or vector processing unit was studied to determine the required memory ordering changes. The memory ordering is crucial for correct functionality. Corner cases were investigated, and the changes were implemented along with the required validation.

SUMMER  2012

INTEL : Many-Integrated Core Division of VPG

Hillsboro, OR

The continuation of the internship involved the design and implementation of a scatter-gather feature in a high-performance CPU, taking into account architecture design, interfaces, timing, synthesis, and area considerations. The complexity of the CPU required (a) in-depth understanding to identify corner cases and support verification, (b) extensive oral and written communication between different teams and experts, and (c) biweekly deadlines to ensure team productivity and effective time management.


Bhavya Daya, "Parallelization of Two-Dimensional Skeletonization Algorithms", Journal of Undergraduate Research, University of Florida, Volume 9, Issue 4, Summer 2008


Sunghyun Park, Tushar Krishna, Chia-Hsin Owen Chen, Bhavya Daya, Anantha P. Chandrakasan, and Li-Shiuan Peh, "Approaching the Theoretical Limits of a Mesh NoC with a 16-Node Chip Prototype in 45nm SOI", 49th ACM/EDAC/IEEE Design Automation Conference (DAC), June 2012


Bhavya Daya, Chia-Hsin Owen Chen, Suvinay Subramanian, Woo-Cheol Kwon, Sunghyun Park, Tushar Krishna, Jim Holt, Anantha Chandrakasan, Li-Shiuan Peh, "SCORPIO: A 36-Core Research Chip Prototype Demonstrating Snoopy Coherence on a Scalable Mesh NOC with In-Network Ordering", 41st International Symposium on Computer Architecture (ISCA), June 2014


Chia-Hsin Owen Chen, Sunghyun Park, Suvinay Subramanian, Tushar Krishna, Bhavya K. Daya, Woo-Cheol Kwon, Brett Wilkerson, John Arends, Anantha P. Chandrakasan, and Li-Shiuan Peh, "SCORPIO: 36-Core Shared-Memory Processor Demonstrating Snoopy Coherence on a Mesh Interconnect", Hot Chips 26: A Symposium on High Performance Chips, August 2014


Bhavya Daya, Li-Shiuan Peh, Anantha Chandrakasan, "Towards High-Performance Bufferless NoCs with SCEPTER", IEEE Computer Architecture Letters, Issue 99, May 2015


Bhavya Daya, Li-Shiuan Peh, Anantha Chandrakasan, "Quest for High-Performance Bufferless NoCs with Single-Cycle Express Paths and Self-Learning Throttling", 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), June 2016

(Nominated for Best Paper Award)





Fundamentals of Engineering Exam (FE) Passed, EIT Certification, Massachusetts, December 2009



IBM SOI Designer Training Workshop

Jan 2010 - Burlington, VT


Digital Integrated Circuits - MIT Circuits Seminar

Dec 2009 -  Cambridge, MA


Numerical Methods for PDEs  - MIT Math Seminar

May 2010 -  Cambridge, MA


"Reconfigurable Network Interface for Multicore Processors"

MARC 2011, Jan 2011 - Cambridge, MA


"Towards the Ideal On-chip Fabric for 1-Many and Many-1 Communication" & "Approaching the Theoretical Limits of a Mesh NoC with a 16-Node Chip Prototype in 45nm SOI"

GSRC/MySyC Annual Review

Oct 2012 - Berkeley, CA


Center for Future Architectures Research Kick-off, University of Michigan

March 2013, Ann Arbor, MI


"SCORPIO: Snoopy Coherent Research Processor with Interconnect Ordering"

Center for Future Architectures Research Annual Review, University of Michigan

October 2013, Ann Arbor, MI


"SCORPIO: Snoopy Coherent Research Processor with Interconnect Ordering"

MARC 2014, Jan 2014 - Bretton Woods, NH


"SCORPIO: A 36-Core Research Chip Prototype Demonstrating Snoopy Coherence on a Scalable Mesh NOC with In-Network Ordering"

ISCA 2014, June 2014, Minneapolis, MN


"SCORPIO: 36-Core Shared-Memory Processor with a Coherent Mesh"

HC26 Program, HOT CHIPS, August 2014, Cupertino, CA


"Distributed Global Ordering in a Mesh Network: Architecture and Application"

E-Workshop Series, CFAR 2014, August 21



"Quest for High-Performance Bufferless NoCs with Single-Cycle Express Paths and Self-Learning Throttling"

53rd ACM/EDAC/IEEE Design Automation Conference (DAC), June 2016



Wiley Publisher: January 2012


Computer Architecture Letters

IEEE : June 2014         

IEEE : December 2014