
SHAKTI Processor Program

The SHAKTI processor program at IIT-Madras has been more or less an official program for a while now, but we have not publicized it since we figured it made more sense to wait till we had some code in place. We finally made our code-line public yesterday, so do go take a look at

bitbucket.org/casl/shakti_public

Our RapidIO interconnect program and our lightstor SSD controller are also in the casl directory. 

Currently we have released a 64-bit, dual-issue, out-of-order (OoO) core. This is the I-Class core, one of the six+ families of cores we plan to release. See

 http://rise.cse.iitm.ac.in/shakti.html

This is just a base core and not a full SoC. A brief run-down of the features of the core:


  1. 64-bit, dual-issue variant of the Shakti processor family. It features an 8-stage pipeline: Fetch, Decode, Rename (Map), Wakeup, Select, Drive, Execute, Commit. Each pipeline stage takes a single cycle to execute.
  2. The core supports all standard instructions of the RISC-V ISA for the RV32I, RV64I and RV32M modes (atomic instructions will be added by Christmas).
  3. Each instruction has its operands renamed in-order but issued out of order to the execution units. Commit is in-order.
  4. Register renaming is done through a merged register file approach. A merged register file stores both the architectural register values and the speculative values. There are 32 architectural registers and 64 physical registers. A buffer (the register alias table) maintains the architectural-register-to-physical-register map.
  5. The branch predictor is a tournament predictor that combines bimodal and global predictors. More advanced BP schemes will be added shortly.
  6. The functional units are parameterised. The current design uses 2 Arithmetic and Logical Units, 1 Branch Unit and 1 Load Store Unit.
  7. The I-Cache and D-Cache use a PIPT scheme and will be changed to a VIPT scheme once the MMU is added. Each cache is nominally configured at 32KB. The cache is fully parameterised in terms of size, associativity, number of blocks within a cache line, number of sets, etc. The caches are implemented using the BRAMs provided in the Bluespec library. These BRAMs map directly onto FPGA Block RAMs, easing the translation to an FPGA-based design.
  8. This variant supports only the machine mode.
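
To make the renaming scheme in item 4 a bit more concrete, here is a minimal Python sketch of a register alias table with a free list. It is only an illustration and not the SHAKTI code (which is written in Bluespec): the 32 and 64 register counts come from the list above, while the class name `RenameStage`, the identity initial mapping and the free-list policy are assumptions made for the example. It also ignores the fact that x0 is hardwired to zero in RISC-V.

```python
from collections import deque

NUM_ARCH = 32   # architectural registers (from the post)
NUM_PHYS = 64   # physical registers (from the post)


class RenameStage:
    """Hypothetical model of a register alias table (RAT) plus free list."""

    def __init__(self):
        # Assumption: architectural register i starts out in physical register i.
        self.rat = list(range(NUM_ARCH))
        # The remaining physical registers are free for speculative results.
        self.free_list = deque(range(NUM_ARCH, NUM_PHYS))

    def rename(self, dest, srcs):
        """Rename one instruction, in program order.

        Returns (phys_dest, phys_srcs, old_phys_dest).
        """
        # Sources read the current mapping before the destination is remapped.
        phys_srcs = [self.rat[s] for s in srcs]
        if not self.free_list:
            raise RuntimeError("no free physical register: rename must stall")
        phys_dest = self.free_list.popleft()
        old_phys = self.rat[dest]
        self.rat[dest] = phys_dest
        return phys_dest, phys_srcs, old_phys

    def commit(self, old_phys):
        # At in-order commit the previous mapping of the destination is dead,
        # so its physical register can be reused.
        self.free_list.append(old_phys)


stage = RenameStage()
# add x5, x1, x2 followed by add x6, x5, x5
first = stage.rename(5, [1, 2])
second = stage.rename(6, [5, 5])
print(first, second)
```

The second instruction picks up the new physical register of x5 as both of its sources, which is how the dependency is tracked once instructions issue out of order.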
