3rd Internal Conference on Computer Architecture (ICCA'02)

Anf. A2, Dep. Inf., Univ. Minho
Campus Gualtar, Braga, PORTUGAL

Editor: Alberto José Proença (Editor's Message)

Programme - 28 Jan'02

9:00 Session 1
CPU Performance and Competition

11:30 Session 2
Challenges of Virtual Worlds

14:30 Session 3
Embedded and Novel System Approaches

16:45 Session 4
Parallel and Distributed Environments

Instructions for authors

Session 1
CPU Performance and Competition

Chairman: António Manuel Pina

09h00m - Welcome Session

Alberto José Proença

09h15m - Battle against Intel CPU Monopoly moves from Lower Costs to Faster IA-32 Processors (0915FasterIA32.pdf)

Giovana Mendes

Abstract: The battle among IA-32 processor manufacturers goes on, with Intel launching the new Pentium 4, AMD the Athlon, and, somewhat more modestly, Cyrix its C3 Samuel 2. This communication analyses the novelties in these launches, namely in multimedia technologies (SSE2 and 3DNow!) and at the micro-architecture level (NetBurst and QuantiSpeed). Remarks on the cost/performance ratio close the communication.

09h30m - Analysis of the War between Intel, Cyrix and AMD

António Manuel L. Rebelo

Abstract: Among the several manufacturers of Intel ix86 processors, three stand out for their impact on the market: Intel, Cyrix and AMD. They compete with each other to offer the best and the fastest processor in the same market segment. This communication analyses what each of them does with cache memory and clock rate to achieve the results it claims, and whether all three manufacturers take the same approach.

09h45m - Cache: why Level it? (0945CacheLevel.pdf)

Nuno Miguel D. C. Dinis

Abstract: Processing speed increases every day; however, all other components are falling behind in this race for speed, particularly the access times of the memory hierarchy. Cache memories play an important role in bridging this gap. This communication gives some clues on how to improve cache performance, and debates the alternative of larger versus layered caches, based on spatial locality, latency, cost, miss rate and efficiency.
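
Editorial illustration (not taken from the paper): the larger-versus-layered trade-off is commonly reasoned about with the textbook average memory access time (AMAT) model. The sketch below assumes hypothetical symbols ($t$ for hit times in cycles, $m$ for local miss rates) and purely illustrative numbers; real values depend on the processor and workload.

```latex
% Single-level cache: hit time plus miss rate times the main-memory penalty
\mathrm{AMAT}_{1} = t_{\mathrm{hit}} + m \cdot t_{\mathrm{mem}}

% Two-level (layered) cache: an L1 miss pays the L2 hit time,
% and only an L2 miss pays the main-memory penalty
\mathrm{AMAT}_{2} = t_{L1} + m_{L1}\,\bigl(t_{L2} + m_{L2} \cdot t_{\mathrm{mem}}\bigr)

% Illustrative (assumed) numbers:
% larger single cache: t_hit = 2, m = 0.03, t_mem = 100
\mathrm{AMAT}_{1} = 2 + 0.03 \cdot 100 = 5 \text{ cycles}
% layered caches: t_L1 = 1, m_L1 = 0.05, t_L2 = 10, m_L2 = 0.20, t_mem = 100
\mathrm{AMAT}_{2} = 1 + 0.05\,(10 + 0.20 \cdot 100) = 2.5 \text{ cycles}
```

Under these assumed numbers the layered design wins because the small L1 keeps the common-case hit time low while the L2 absorbs most of the L1 misses; with different latencies or miss rates the comparison can go the other way.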

10h00m - The Gap between Processor and Memory Speeds (1000Gap_Proc-Mem_Speed.pdf)

Carlos Alberto G. Carvalho

Abstract: The continuously growing gap between CPU and memory speeds is an important drawback for overall computer performance. Starting by identifying the problem and the complexity behind it, this communication addresses the recent past and current efforts to attenuate this disparity, namely memory hierarchy strategies, improvement of bus controllers and the development of smarter memories. The communication ends by pointing out directions for technology evolution over the next few years.

10h15m - Crusoe – An Approach for the New Era Computing (1015Crusoe.pdf)

Vasco Nuno B. C. Miranda

Abstract: In an era where mobile computing becomes imperative, this communication looks for solutions that may make it technically viable. In this context, integration capacity and behaviour under overheating are the main concerns. This communication makes an incursion into the technology behind the new Crusoe processor, which attempts to solve some of these problems and sets new standards for processor design.

10h30m - Strategies to Improve Performance in the IA64 Architecture (1030Perf_IA64.pdf)

Guilherme António Teixeira

Abstract: The Intel Itanium® 64-bit architecture is designed to reach high levels of performance by pursuing two main goals: the use of explicit parallelism (EPIC), where the compiler is able to express the inherent parallelism in the instruction sequence, and the improved use of cache and register file, to decrease data miss penalties. However, to take real advantage of the model, compilers must be radically changed and optimised to use predication, speculative execution, and a set of novel features. This communication explores how these main goals of the IA64 can be made feasible, and how they were architecturally implemented, making them a potential solution to improve performance.

10h45m - Improve Computing Performance, without any Hardware Update (1045PlusPerf-NoNewHw.pdf)

Carlos Jorge F. Lopes

Abstract: Computer performance is an important issue that programmers must take into account when they develop robust and consistent solutions for professional applications. Usually the performance problem is "solved" with a hardware upgrade, instead of using optimization techniques during the development process that help the compiler with its own optimization. Programmers must know how compilers work in order to help with their optimization tasks. Cache memory also plays a major role in this issue. This communication focuses on some optimization techniques that can be applied to the most common computer performance problems, and also attempts to draw some conclusions on the more adequate way to meet the performance goals: through the programmer, or by leaving it to the compiler.

11h00m - Coffee break

Session 2
Challenges of Virtual Worlds

Chairman: João Luís Sobral

11h30m - Challenges in the Architecture of Electronic Games (1130ChallenGamesArch.pdf)

Ricardo Alexandre G. C. Martins

Abstract: This communication presents topics on video game consoles from the past to the present. It traces their evolution in several aspects (video, graphics, performance) and gives a glimpse of what the future will bring.

11h45m - Gekko, Emotion Engine and Intel PIII Architectures - Fighting for the Gamer (1145Fight_for_Gamer.pdf)

Eurico Alexandre T. Borges

Abstract: This communication addresses the current state of the art in electronic games consoles and the architecture behind them. The presented topics include both an overview and a detailed analysis of the current top three gaming systems. It compares the technology and solutions used by the console manufacturers, with emphasis on the consoles' CPUs, in the quest for the home entertainment market. It shows why these systems should not be considered the poor relation of mainstream computer systems. Finally, it identifies some of the driving forces for the evolution of electronic games and gives some hints on what can be expected in the future.

12h00m - Graphics Cards: Present, Past and Future Trends (1200GraphicCards.pdf)

João Manuel E. D. Andrade

Abstract: In the last few years, with virtual reality, multimedia and especially interactive entertainment, i.e. games, the need for massive, on-time 3D graphics has increased tremendously. Graphics cards have become much more important and have evolved into highly efficient processing engines that can now be viewed as highly specialized co-processors with their own big processing and data feeding challenges. The balance between what is done by the processor and by the graphics card, the use of “brute force” versus more efficient geometrical and tile-based algorithms, the huge impact of memory bandwidth and the overall platform integration, all in order to deliver the best frame rate with optimal quality, are the issues in today’s and near-future graphics processing. This also raises an issue for graphics benchmarking, as the evaluation of graphics cards, both their image quality and their performance, becomes increasingly relevant to find where the real trends are, and to distinguish marketing from real cutting-edge solutions.

12h15m - Silicon Virtual Machines (1215SiliconVirtualMachines.pdf)

Alberto Manuel B. Simões

Abstract: This communication looks into some virtual machine implementations, from the common stack-based architecture to the newest register-based one. It evaluates the feasibility of their implementation in silicon and how this approach can improve performance while maintaining language flexibility. Silicon virtual machines have not been successful in the CPU market, mainly due to the inherent limitations of a virtual machine architecture: they are designed too closely around one language, making them very efficient for that language and very slow when executing applications developed in other languages. This communication concludes with the statement that the best virtual machine we can make is a simple processor, similar to the ones we use nowadays.

12h30m - Lunch

Session 3
Embedded and Novel System Approaches

Chairman: João Miguel Fernandes

14h30m - Embedded Systems Architecture (1430EmbSysArch.pdf)

Sérgio de Jesus D. Dias

Abstract: When a computer is an integral part of a larger system, where it intensively interacts with the surrounding environment and frequently deals with real-time constraints, it is considered an embedded system. This communication describes the embedded system world: architecture, classification, characteristics, system limitations, and trends. It also discusses architectural features that cannot exist in hard real-time control.

14h45m - RISC vs. CISC: the Post-RISC (1445RISCvsCISC.pdf)

Vasco Nuno C. Santos

Abstract: This communication analyses the birth of RISC and CISC architectures and their evolution over the past 20 years. It gives an overview of the competition between the two to win the performance race by adding new features, mainly borrowed from the opposite side, and in the end converging into what is now known as the Post-RISC era. The communication complements this issue by taking a brief look into novel VLIW processors: the Transmeta Crusoe and the Intel Itanium with the new EPIC instruction set architecture.

15h00m - Is there any Chance for Reconfigurable Processors Today? (1500ReconfigProc.pdf)

Nelson Ezequiel F. Nunes

Abstract: Reconfigurable processors are an emerging alternative to the increasingly complex, fixed ISA (Instruction Set Architecture) approach: they contain hardware resources that are not pre-allocated to the execution of pre-defined functions, but instead can be configured with some logic commands. This communication shows the main differences between the traditional processor architecture and the reconfigurable processor, with some reference to their internal organization, performance, compatibility and to companies that are pursuing this novel approach. This communication also points out the motivations behind the current availability of commercial reconfigurable processors.

15h15m - Systems on Chip: Evolutionary and Revolutionary Trends (1515SOC.pdf)

Arnaldo Rui N. V. E. Brás

Abstract: Emerging computational needs imposed by the market lead the major industries to adopt the “systems-on-chip” design. This paper briefly demonstrates the whys and hows associated with this activity. It outlines the major issues that must be addressed in the process of designing, verifying and publishing those systems. Finally, it establishes a link between these techniques and the latest achievements in computer science.

15h30m - Pervasive Computing: Architectural Challenges (1530Ubicomp.pdf)

Victor Manuel S. Coelho

Abstract: Computing anytime, everywhere - the third wave of computing - is still far from being achieved. This communication addresses the ongoing research on what are considered to be the two stepping stones of this third generation, wearable computing and ubiquitous computing, and the efforts to achieve interoperability among these devices. A brief introduction is made to the human-computer interface (HCI), as a method to reduce the problem of user input in wearables; to augmented reality (AR), which provides the most common everyday objects with sensing, processing and communication capabilities; and to context awareness, which allows a certain behaviour to be chosen at a certain time. Then, based on this analysis, we explain why these two models will come to be the basis of ‘computing everywhere’.

15h45m - Simultaneous Multithreading: A Platform for Next Generation Processors (1545SimultMultiThread.pdf)

Paulo Alexandre V. C. Assis

Abstract: Aiming at high performance, contemporary computer systems statically partition processor resources, taking advantage of either instruction level parallelism (ILP) or thread level parallelism (TLP). Simultaneous Multithreading (SMT) explores parallel processing on an alternative architecture: it inherits from superscalar architectures the ability to issue multiple instructions per clock cycle, while it can execute several programs or threads at once, a concept borrowed from multithreaded processors. The result is a processor that can simultaneously issue multiple instructions from multiple threads, capable of adapting to dynamically changing levels of ILP and TLP in a program.

16h00m - Intelligent RAM: a Radical Solution? (1600IRAM.pdf)

João Paulo P. Araújo

Abstract: The goal of an "intelligent" RAM is to design a cost-effective computer by including processing capabilities in a memory device, as an alternative to the current processor chips that contain part of the memory hierarchy on-chip. This merger leads to a reduction in memory latency, an increase in memory bandwidth, and an improvement in energy efficiency, and it also supports a more flexible selection of memory size and organization. This communication reviews current approaches and their most important features, exploring their opportunities and challenges.

16h15m - Coffee break

Session 4
Parallel and Distributed Environments

Chairman: Luís Paulo Santos

16h45m - DSM Systems Design Issues (11645DSM-DesIss.pdf)

Fernando da Gama Vieira

Abstract: In recent years, the efficient use of Distributed Shared Memory (DSM) systems has become a major R&D topic. This communication explores and assesses some of the approaches being proposed, namely how key issues such as data consistency or granularity are addressed. This communication also discusses the impact of DSM management techniques on application performance, and the weaknesses and strengths of these techniques, suggesting the type of applications that are best suited to take advantage of DSM systems.

17h00m - High Throughput Computing: stealing Unused Cycles (1700HTC.pdf)

Luís Filipe R. M. Ferreira

Abstract: High Throughput Computing (HTC) systems make otherwise idle cycles available to computations that involve many independent tasks. In a Distributed Computing environment, distributed ownership of computing resources is the major obstacle HTC has to overcome to take advantage of under-utilized systems in the network. This paper addresses HTC and how it fits into current parallel and distributed computing architectures such as Cluster Computing, Internet Computing and Metacomputing. Some of the available HTC packages are presented, with a focus on the Condor system. It concludes on the importance of HTC in narrowing the gap between personal computing and true parallel departmental computing in existing companies. Finally, it addresses the importance of this technology to the future of global computing across multiple organizations (GRID environments).

17h15m - Application Oriented Management Services in Time-Shared Clusters (1715AOManagClusters.pdf)

José Manuel A. Pereira

Abstract: The need for cluster management software is increasing as cluster computing becomes more common. This communication presents and briefly explains some services that applications usually require, and therefore should be provided in cluster software management packages; it also looks into some cluster software management packages that provide those services. An application oriented step by step method to select a cluster software management package is presented, and its drawbacks are analysed.

17h30m - Run-Time Management of Heterogeneous Shared Clusters (1730RT-ManagShClusters.pdf)

António Paulo S. S. Santos

Abstract: The requirements for High Performance Computing (HPC) and/or High Throughput Computing (HTC) are increasing worldwide. To achieve high performance standards in a shared cluster (SC) based on COTS (commodity off-the-shelf) platforms, the following requirements were identified: (i) heterogeneity, which will provide capabilities to operate with different operating systems, computing nodes, storage technologies and communication topologies and standards; (ii) on-line management, which will provide run-time analysis and administration of CPU load, memory usage, disk space and communication bandwidth, for an efficient decision on job allocation to the right computing node. This presentation discusses some of the features needed for heterogeneous and run-time management, with a review of available non-proprietary cluster computing software.

17h45m - Benchmarking On-the-fly: a Need for Dynamic Load Distribution in Parallel/Distributed Environments

António Alexandre S. Gouveia

Abstract: Standard benchmark suites are a popular way to measure and compare computer performance, and there are already several implementations to rate performance on "static" machines. However, there is a lack of metrics to help estimate the performance of a computing node in a shared environment, in order to decide, on-the-fly, how to distribute the workload. This communication starts by identifying the most relevant parameters for managing the runtime load distribution (e.g. latency time, memory hierarchy, ...) and then addresses some metrics aimed at an efficient workload distribution.
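
Editorial illustration (not taken from the paper): one simple way to use an on-the-fly performance estimate is to split independent tasks in proportion to each node's latest score. The sketch below assumes a hypothetical `split_workload` helper and made-up node names and scores; how a score is actually measured (latency, memory hierarchy effects, current load) is exactly the open question the abstract raises.

```python
from typing import Dict


def split_workload(total_tasks: int, node_scores: Dict[str, float]) -> Dict[str, int]:
    """Split independent tasks in proportion to each node's latest score.

    node_scores maps a (hypothetical) node name to a positive number where
    higher means faster, e.g. tasks completed per second in a recent probe.
    """
    if total_tasks < 0:
        raise ValueError("total_tasks must be non-negative")
    if not node_scores or any(score <= 0 for score in node_scores.values()):
        raise ValueError("every node needs a positive score")

    total_score = sum(node_scores.values())
    # Ideal (fractional) share of the work for each node.
    shares = {node: total_tasks * score / total_score
              for node, score in node_scores.items()}
    # Start from the whole-number part of each share.
    allocation = {node: int(share) for node, share in shares.items()}

    # Hand out the tasks lost to rounding, largest fractional part first.
    leftover = total_tasks - sum(allocation.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - allocation[n],
                          reverse=True)
    for node in by_remainder[:leftover]:
        allocation[node] += 1
    return allocation


if __name__ == "__main__":
    # Assumed scores: node "b" was measured three times faster than node "a".
    print(split_workload(100, {"a": 1.0, "b": 3.0}))
```

In a shared environment the scores would be re-measured periodically and the split recomputed, so that a node that becomes loaded by another user receives less work in the next round.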

18h00m - Closing Session

Editor's Message

A PG course in a fast-moving technological domain usually accepts applications from students with considerably different backgrounds. Lecturing Computer Architecture (CA) in an M. Sc. in Informatics under these conditions poses a real challenge: how to seduce students with no previous knowledge of CA, and simultaneously motivate those with a solid and updated background. The obvious solution is to customize the contents for each student; but is that feasible for over 20 students, with only one lecturer, who already shares this lecturing activity with other academic duties?

The ICCA approach attempts to complement the traditional set of academic lectures - which gives an updated overview of the more relevant topics in CA - with the individual commitment of each student to further explore a particular interest area in CA.
ICCA plays with words (Internal is very close to International...) to encourage this new breed of scientists to organize their literature search, to filter the relevant material out of so many available sources, to structure their minds to produce a coherent message to communicate to the fellow "scientists", to practice the basic rules of science report writing (and to follow the author's instructions based on the well known "Lecture Notes on ..."), and to overcome the fear of a public talk.

This is the 3rd year in which this integrated approach is being applied in CA; this is also the 3rd time I have to write this "Editor's Message", trying not to repeat what was said before... Come and visit ICCA'2000 as well! As the organizer of this event, I proudly state that I am very pleased once again with the enthusiasm the students showed in producing high-quality communications within tight schedules. Each student will receive a printed copy of the proceedings on the presentation day, which, for the first time, will include a class photo! The whole contents of ICCA'02 are available at http://gec.di.uminho.pt/discip/minf/ac0102/icca02.htm.

My sincere congratulations and many thanks to all who contributed once more to make this event a successful one, namely all the M. Sc. students and my colleagues who played the role of external referees and session chairmen (António Joaquim Esteves, António José Fernandes, António Manuel Pina, João Luís Sobral, João Miguel Fernandes and Luís Paulo Santos).

Braga, 24-Jan-02

Alberto José Proença

PS. A copy of the ICCA'02 Proceedings is also available in zip format.