Last updated: 2004-3-31 21:53:53
46 International IC – Taipei • Conference Proceedings
Configuring a VLIW-DSP
core for application specific
requirements
Oz Levia
Chief Technology Officer
Improv Systems Inc.
Abstract
In this paper we describe an architectural approach to a configurable
and scalable Very Long Instruction Word (VLIW) DSP core for
embedded systems. We focus our presentation on actual
experience with a specific configurable VLIW DSP core, the
Jazz DSP processor. Following an introduction to the processor
and the VLIW structure, we discuss the methodology and tools
required for application-specific configuration of the processor.
Introduction
Configurable VLIW Processor
In recent years, very long instruction word (VLIW) approaches
have become increasingly prominent in the high-end DSP space.
The reason is straightforward: VLIW provides parallel
execution of operations to significantly increase performance.
Unlike in superscalar approaches, the overhead of determining this
parallelism is paid once in the compiler rather than each time the
application is run. The ‘price’ to be paid for performance comes
in the width of the instruction and the resultant potential increase
in the memory image for a given application.
The Jazz DSP core
Unlike many VLIW processors, the Jazz processor does not
aggregate computation units (ALU, multiplier, shifter) into a
single data path but provides a flat collection of computation
units. This allows the compiler to use the computation units to
their best advantage in each instruction. Also, the Jazz processor
is part of Improv’s general programmable system architecture
(PSA), which provides a unique approach to combining multiple
processors into a single structure. One aspect of this approach is
the ability to attach multiple memory ports to each Jazz
processor.
The Jazz DSP core is a flexible VLIW processor that delivers high
performance with low power consumption. As can be seen in
Figure 1, Jazz has an array of Computational Units (CUs) and an
array of Memory Interface Units (MIUs). A Task Control Unit is used to
control task queuing and execution. Instructions are fetched from
a program address space. Jazz is designed to deliver high
performance at moderate clock speeds through the use of
parallelism. As a result, the hardware design is relatively simple and
power consumption is very low.
Figure 1: The Jazz DSP Core
A flexible DSP Core
Like configurable RISC processors, designers working with
configurable VLIW processors can achieve significant gains
by adding custom logic into the data path. However, the VLIW
approach offers significant opportunities for designers above
and beyond those afforded by configurable RISC processors.
The possible opportunities for configuring a VLIW processor
include:
* Defining the collection of Computational Units (CUs) in the
processor (ALUs, MACs, etc.) that can operate in parallel
each cycle.
* Adding custom CUs to the processor to accelerate common
or critical program parts.
* Configuring the VLIW instruction to trade off parallelism
against instruction word width.
* Changing the number of Memory Interface Units (MIUs) to vary
the number of memory accesses into and out of the processor
datapath each cycle.
* Modifying other aspects of the processor to trade off power
and performance against area, for example the number and location
of registers, task queue depth, processor data connectivity
and more.
Mix and Match Computation Units
To increase performance with configurable processors, the
general belief is that the designer must add custom logic and
instructions. However, with Improv’s Jazz processor, the
designer can increase performance without any hardware design.
This is achieved by creating different combinations of
computation units in the processor to create a mix that is
specifically tuned to an application domain.
The Jazz processor can contain multiple computation units
including ALUs, MACs, and shifters. Improv provides a robust
collection of these computation units in its base offering.
Designers can define the collection of computation units in the
processor to change the number and type of operations that can
be executed in each instruction. For instance, a designer might
want to create a processor with 3 ALUs, 1 shifter and 1 MAC
for ALU-intensive application domains, or a processor
with 2 ALUs, 2 shifters and 2 MACs for more MAC-intensive
and balanced application domains.
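
The effect of the unit mix can be illustrated with a rough
resource bound. The sketch below is not part of the Jazz tools.
It assumes that every operation takes one cycle on one unit of
the matching type, ignores data dependencies and memory ports,
and uses made-up operation counts for two hypothetical inner
loops (`control_loop`, `filter_loop`).

```python
import math

def min_cycles(op_counts, unit_counts):
    """Lower bound on cycles: each unit issues at most one op per cycle."""
    cycles = 0
    for kind, ops in op_counts.items():
        units = unit_counts.get(kind, 0)
        if ops and units == 0:
            raise ValueError(f"no {kind} unit in this configuration")
        if ops:
            cycles = max(cycles, math.ceil(ops / units))
    return cycles

# The two example mixes from the text.
ALU_HEAVY = {"alu": 3, "shifter": 1, "mac": 1}
BALANCED = {"alu": 2, "shifter": 2, "mac": 2}

# Hypothetical per-iteration operation counts (illustrative only).
control_loop = {"alu": 12, "shifter": 2, "mac": 2}
filter_loop = {"alu": 4, "shifter": 2, "mac": 8}

for name, ops in [("control_loop", control_loop), ("filter_loop", filter_loop)]:
    print(name, min_cycles(ops, ALU_HEAVY), min_cycles(ops, BALANCED))
# control_loop 4 6
# filter_loop 8 4
```

Under these assumptions the ALU-heavy mix wins on the ALU-bound
loop and the balanced mix wins on the MAC-bound loop, with no
custom hardware in either case.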
Designer-Defined Computation Units
For most applications, combinations of general-purpose
computation units can provide enough performance. However,
for very high-performance applications like network processing,
multi-channel speech processing and image/video processing,
it can be important to find every opportunity to increase
performance while maintaining programmability. Designers can
analyze applications and identify critical, high impact operations
that can be implemented in custom logic and added into the
processor.
In the Jazz processor, designers can define and insert their
own custom computation units, called designer-defined
computation units (DDCUs). A DDCU is described to the compiler
as a set of operations and resources (controlled, template-based
Verilog code is also supported for the hardware implementation).
The compiler binds specific operations to available resources,
allowing the designer to continue to use high-level programming
without any machine-specific code.
For example, consider an application that can be accelerated
by adding an operation to perform 5-bit addition. The designer
could create a custom unit to perform this operation and add it
into the processor. However, it is much easier to add the same
operation and additional logic to the pre-defined ALU
computation unit. The ALU unit already supports a number of
operations, and the designer simply maps those
operations plus the new 5-bit addition operation to the new unit.
The user can then include the new unit in the processor, and this
unit can still be used for standard ALU operations.
Using this feature, users of Jazz can create CUs that accelerate
critical parts of an application without giving up the ability
to use the compiler and other analysis tools.
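
The ALU extension described above can be pictured by treating
each unit as the set of operations it supports. This is only an
illustrative model, not Improv's DDCU description format; the
operation names (`add5`, `add`, `mac`, ...) and unit names are
assumptions made for the example.

```python
# Each computation unit is modelled only as the set of operations it supports.
BASE_ALU_OPS = frozenset({"add", "sub", "and", "or", "xor"})  # assumed op names

def extend_unit(base_ops, new_ops):
    """Return a unit that keeps every base operation and adds the new ones."""
    return frozenset(base_ops) | frozenset(new_ops)

def bind(operation, units):
    """Return the sorted names of the units that can execute `operation`."""
    return sorted(name for name, ops in units.items() if operation in ops)

units = {
    "alu0": BASE_ALU_OPS,
    "alu1_ddcu": extend_unit(BASE_ALU_OPS, {"add5"}),  # ALU plus 5-bit add
    "mac0": frozenset({"mul", "mac"}),
}

print(bind("add5", units))  # ['alu1_ddcu']
print(bind("add", units))   # ['alu0', 'alu1_ddcu']
```

The extended unit is the only candidate for the new operation,
yet it remains a candidate for every standard ALU operation, so
adding it does not cost an ALU.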
Select Bandwidth to Memory
VLIW offers performance through parallelism, but multiple
operations per cycle require bandwidth to and from memory to
match the computational bandwidth. Designers can add or
remove Memory Interface Units (MIUs) and can select from a set
of MIUs with different capabilities, for example R/W access,
byte access, wait-state support, and others. Designers can also
create their own MIUs.
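
A rough balance check shows why the number of MIUs matters. The
sketch below assumes that compute and memory traffic overlap
perfectly and that each MIU completes one access every
(1 + wait states) cycles. The function name and the example
numbers are hypothetical.

```python
import math

def loop_cycles(compute_cycles, memory_accesses, num_mius, wait_states=0):
    """Cycles per iteration when compute and memory traffic fully overlap.

    Assumption: each MIU completes one access every (1 + wait_states) cycles.
    """
    if num_mius < 1:
        raise ValueError("at least one MIU is required")
    memory_cycles = math.ceil(memory_accesses * (1 + wait_states) / num_mius)
    return max(compute_cycles, memory_cycles)

# Hypothetical loop: 4 cycles of computation, 12 memory accesses per iteration.
for mius in (1, 2, 3, 4):
    print(mius, loop_cycles(4, 12, mius))
# 1 12
# 2 6
# 3 4
# 4 4
```

In this example the loop is memory-bound until a third MIU is
added. A fourth MIU brings no further gain because the
computation units become the limit.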
Instruction Word Configuration
VLIW offers significant performance opportunities. However,
for some applications the tradeoff between the size of the
instruction word and potential performance needs to be
considered. Improv’s Jazz Composer allows the designer to
define the number of slots available in the instruction for
computation units and then assign one or more computation
units into each slot. This allows the designer to populate the
processor with a generous mix of computation units without
paying a high price in instruction width. It also means that the
designer can configure a RISC-like processor by overlaying
multiple computation units into a single slot in the instruction.
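
The trade-off between slots and instruction width can be
sketched numerically. The field sizes below are assumptions
chosen for illustration and do not describe the real Jazz
encoding: each slot is given a fixed 24-bit opcode and operand
field plus just enough bits to select one of the units overlaid
in that slot. Units sharing a slot are assumed to be mutually
exclusive within one instruction.

```python
OPERAND_BITS = 24  # assumed opcode + operand field per slot (illustrative only)

def slot_bits(units_in_slot):
    """Bits for one slot: unit-select field plus the assumed operand field."""
    select_bits = (units_in_slot - 1).bit_length()  # 0 bits for a single unit
    return select_bits + OPERAND_BITS

def instruction_bits(slots):
    """Total instruction width; `slots` is a list of lists of unit names."""
    return sum(slot_bits(len(units)) for units in slots)

def max_ops_per_instruction(slots):
    """One operation can issue per slot, so parallelism equals the slot count."""
    return len(slots)

wide = [["alu0"], ["alu1"], ["sh0"], ["sh1"], ["mac0"], ["mac1"]]
compact = [["alu0", "sh0"], ["alu1", "sh1"], ["mac0", "mac1"]]
risc_like = [["alu0", "alu1", "sh0", "sh1", "mac0", "mac1"]]

for name, slots in [("wide", wide), ("compact", compact), ("risc_like", risc_like)]:
    print(name, instruction_bits(slots), max_ops_per_instruction(slots))
# wide 144 6
# compact 75 3
# risc_like 27 1
```

All three layouts contain the same six units. Only the number
of slots, and therefore the instruction width and the peak
parallelism, changes.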
Application Specific Configuration: Design methodology
The unique strength of the Jazz Processor is in the close
cooperation between the configurable core and the programming
tools that support the processor. The design methodology is
iterative, as shown in Figure 2.
Figure 2: Design Methodology
As can be seen, application code is compiled using Solo
(the Jazz compiler), and the results are analyzed using a mix of
static and dynamic measurements. This feedback is then used to
modify the application (optimization) or the processor
(configuration). The tools adjust to the processor configuration by
reading and processing a platform configuration file.
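
The processor side of this loop can be sketched as follows. The
code is a toy stand-in, not the Solo or Jazz Composer flow: the
"compile and measure" step is replaced by a simple resource
bound, the feedback rule (add one unit of the bottleneck type)
is an assumption, and the operation counts are invented.

```python
import math

def estimate_cycles(op_counts, config):
    """Stand-in for 'compile and measure': a simple resource-bound estimate."""
    return max(math.ceil(n / config[kind]) for kind, n in op_counts.items() if n)

def tune(op_counts, config, target_cycles, max_units=8):
    """Add one unit of the bottleneck type per iteration until the target is met."""
    config = dict(config)
    history = [(dict(config), estimate_cycles(op_counts, config))]
    while history[-1][1] > target_cycles and sum(config.values()) < max_units:
        # Feedback step: find the unit type that currently limits performance.
        bottleneck = max(op_counts,
                         key=lambda k: math.ceil(op_counts[k] / config[k]))
        config[bottleneck] += 1  # reconfigure the processor
        history.append((dict(config), estimate_cycles(op_counts, config)))
    return history

ops = {"alu": 6, "shifter": 2, "mac": 8}        # hypothetical workload
start = {"alu": 1, "shifter": 1, "mac": 1}      # minimal starting configuration
for config, cycles in tune(ops, start, target_cycles=3):
    print(config, cycles)
```

Each pass corresponds to one trip around the methodology:
measure, identify the limiting resource, change the
configuration, and measure again. Optimizing the application
itself, the other branch of the feedback, is not modelled here.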
Jazz Composer
The Jazz processor is configured using a graphical tool called
the Jazz Composer (see Figure 3), which provides an intuitive
drag-and-drop facility. The designer can configure specific
characteristics of the base processor structure including data
width of the processor, number of constant registers and depth
of the hardware task queue. Similar features are available in
most configurable processors. Jazz Composer takes configurable
processing to a new level by allowing the designer to address
all of the opportunities discussed earlier.
Figure 3: The Jazz Composer GUI
The Solo Compiler
The most critical tool in the methodology is the compiler. To
maintain time-to-market advantage, designers must be able to
stay with high-level language programmability. For VLIW, the
compiler is even more critical because of the complexity of
managing parallel data path elements, multiple memory accesses
and distributed register systems. Improv’s compiler maps the
operations used in an application onto a target processor by
matching each operation to a computation unit that supports
that operation. Improv’s advanced VLIW code generation
manages data movement through the concurrent data path,
parallelization of operations and resource management.
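
The matching of operations to computation units can be
illustrated with a greedy list scheduler. This is a generic
textbook technique shown under simplifying assumptions
(single-cycle operations, no register or memory-port limits),
not a description of Improv's code generator. All operation and
unit names are hypothetical.

```python
def schedule(ops, units):
    """Greedy list scheduling into VLIW instructions.

    ops:   list of (name, operation, [names it depends on]) in program order
    units: dict unit_name -> set of supported operations
    Returns a list of instructions, each a dict unit_name -> op name.
    """
    issued_in = {}          # op name -> cycle in which it was issued
    pending = list(ops)
    instructions = []
    while pending:
        cycle = len(instructions)
        free = dict(units)  # units not yet used in this instruction
        word, remaining = {}, []
        for name, operation, deps in pending:
            # Ready only if every input was issued in an earlier cycle.
            ready = all(d in issued_in and issued_in[d] < cycle for d in deps)
            unit = next((u for u, sup in free.items() if operation in sup), None)
            if ready and unit is not None:
                word[unit] = name
                del free[unit]
                issued_in[name] = cycle
            else:
                remaining.append((name, operation, deps))
        if not word:
            raise ValueError("no progress: unsupported op or cyclic dependency")
        instructions.append(word)
        pending = remaining
    return instructions

ops = [
    ("t0", "mul", []),
    ("t1", "mul", []),
    ("t2", "add", ["t0", "t1"]),
    ("t3", "shl", ["t2"]),
    ("t4", "add", []),
]
one_mac = {"alu0": {"add", "sub"}, "mac0": {"mul", "mac"}, "sh0": {"shl", "shr"}}
two_macs = dict(one_mac, mac1={"mul", "mac"})

print(len(schedule(ops, one_mac)))   # 4
print(len(schedule(ops, two_macs)))  # 3
```

Because the scheduler only looks at which operations each unit
supports, the same source program can be retargeted to a
different unit mix without any change to the program.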
Author’s contact details
Oz Levia
Improv Systems Inc.
1485 Saratoga Ave., #100
San Jose, CA 95129
Phone: 1-408 517 4790
Fax: 1-408 517 4799
Email: ozl@improvsys.com