Last update:
Thu Apr 6 06:35:30 MDT 2023
Anonymous Dedication: Daniel L. Slotnick, 1931 to 1985 . . . . . . . . . . . . . . . . . . 5--6 Anonymous Editorial . . . . . . . . . . . . . . . 7--8 Anonymous Contributing Authors . . . . . . . . . . 9--12 Howard Jay Siegel and William Tsun-Yuk Hsu and Menkae Jeng An introduction to the multistage cube family of interconnection networks . . . 13--42 David H. Bailey A high-performance fast Fourier transform algorithm for the Cray-2 . . . 43--60 Mary Jane Irwin and Robert Michael Owens Digit pipelined processors . . . . . . . 61--86 Joanne L. Martin and Dieter Mueller-Wichards Supercomputer Performance Evaluation: Status and Directions . . . . . . . . . 87--104 J. L. Bell and G. S. Patterson, Jr. Data organization in large numerical computations . . . . . . . . . . . . . . 105--136
Carlton M. Osburn and Arnold Reisman Challenges in advanced semiconductor technology for high-performance and supercomputer applications . . . . . . . 149--189 Duncan A. Buell Factoring: Algorithms, computations, and computers . . . . . . . . . . . . . . . 191--216 M. C. Wunderlich and H. C. Williams A parallel version of the continued fraction integer factoring algorithm . . 217--230 H. L. Resnikoff Cost-effectiveness of concurrent supercomputers . . . . . . . . . . . . . 231--262
Thomas R. Caron and Robert D. Silverman Parallel Implementation of the Quadratic Sieve . . . . . . . . . . . . . . . . . 273--290 Roger G. Grimes Solving Systems of Large Dense Linear Equations . . . . . . . . . . . . . . . 291--299 D. A. Buell and D. A. Carlson and Yuan-chieh Chow and K. Culik and N. Deo and R. Finkel and E. N. Houstis and E. M. Jacobson and Z. M. Kedem and J. S. Kowalik and P. J. Kuekes and J. L. Martin and G. A. Michael and N. S. Ostlund and J. Potter and D. K. Pradhan and M. J. Quinn and G. W. Stewart and Q. F. Stout and L. Watson and J. Webb Letter to the Editor: Parallel algorithms and architectures: report of a workshop . . . . . . . . . . . . . . . 301--325 W. P. Petersen Some vectorized random number generators for uniform, normal, and Poisson distributions for CRAY X-MP . . . . . . 327--335 John Riganati Book review . . . . . . . . . . . . . . 337--339 Anonymous Contributing authors . . . . . . . . . . 341--342 Anonymous Editorial . . . . . . . . . . . . . . . ??
Nigel P. Topham and Amos Omondi and Roland N. Ibbett On the Design and Performance of Conventional Pipelined Architectures . . 353--393 Clyde P. Kruskal and Carl H. Smith On the Notion of Granularity . . . . . . 395--408 Richard E. Anderson and Roger G. Grimes and Horst D. Simon Performance comparison of the CRAY X-MP/24 with SDD and the CRAY-2 . . . . 409--419 Anonymous Contributing authors . . . . . . . . . . 421--424 Anonymous Editorial . . . . . . . . . . . . . . . ?? James F. Koehler Dedication: Michael J. Kascic, 1941--1987 . . . . . . . . . . . . . . . ??
Vladimir Cherkassky and Ross Smith Efficient Mapping and Implementation of Matrix Algorithms on a Hypercube . . . . 7--27 Nigel P. Topham and Amos Omondi and Roland N. Ibbett Context Flow: An Alternative to Conventional Pipelined Architectures . . 29--53 Youngju Won and Sartaj Sahni Maze Routing on a Hypercube Multicomputer . . . . . . . . . . . . . 55--79 S. Lakshmivarahan and Sudarshan K. Dhall A New Hierarchy of Hypercube Interconnection Schemes for Parallel Computers . . . . . . . . . . . . . . . 81--108 Anonymous Contributing authors . . . . . . . . . . 109--112 Anonymous Glossary of terms (supercomputing) . . . 113--123
Utpal Banerjee An Introduction to a Formal Theory of Dependence Analysis . . . . . . . . . . 133--149 David Callahan and Ken Kennedy Compiling Programs for Distributed-Memory Multiprocessors . . . 151--169 Marina Chen and Young-Il Choo and Jingke Li Compiling Parallel Programs by Optimizing Performance . . . . . . . . . 171--207 Laxmikant V. Kale and David A. Padua and David C. Sehr OR Parallel Execution of Prolog Programs with Side Effects . . . . . . . . . . . 209--223 Zhiyuan Li and Pen-Chung Yew Program Parallelization with Interprocedural Analysis . . . . . . . . 225--244 Anonymous Contributing authors . . . . . . . . . . 245--250 Anonymous Editorial . . . . . . . . . . . . . . . ?? Anonymous Publisher's announcement . . . . . . . . ??
Zhijing G. Mou and Paul Hudak An Algebraic Model for Divide-and-Conquer and Its Parallelism 257--278 Alexandru Nicolau and Keshav Pingali and Alexander Aiken Fine-Grain Compilation for Pipelined Machines . . . . . . . . . . . . . . . . 279--295 Constantine D. Polychronopoulos Toward Auto-Scheduling Compilers . . . . 297--330 Jon A. Solworth Programming Language Constructs for Highly Parallel Operations on Lists . . 331--347 Min-You Wu and Daniel D. Gajski A Programming Aid for Hypercube Architectures . . . . . . . . . . . . . 349--372 Anonymous Contributing authors . . . . . . . . . . 373--376 Anonymous Publisher's announcement . . . . . . . . ??
Michael Bieterman Microtasking General Purpose Partial Differential Equation Software on the CRAY X-MP . . . . . . . . . . . . . . . 381--413 Jim Armstrong A multi-algorithm approach to very high performance one-dimensional FFTs . . . . 415--433 Youngju Won and Sartaj Sahni A Balanced Bin Sort for Hypercube Multicomputers . . . . . . . . . . . . . 435--448 Anonymous Contributing authors . . . . . . . . . . 449--450
Anonymous Editorial . . . . . . . . . . . . . . . 3--3 D. C. S. Allison and Amal Chakraborty and Layne T. Watson Granularity Issues for Solving Polynomial Systems via Globally Convergent Algorithms on a Hypercube . . 5--20 A. Yavuz Oruç and Seth Schneider Coset Networks as Connectors in Parallel Processors . . . . . . . . . . . . . . . 21--39 Youngju Won and Sartaj Sahni Hypercube-to-Host Sorting . . . . . . . 41--61 Anonymous Contributing authors . . . . . . . . . . 63--65
Michael Burke and Ron Cytron and Jeanne Ferrante and Wilson Hsieh Automatic generation of nested, fork-join parallelism . . . . . . . . . 71--88 Duncan A. Buell and Robert L. Ward A Multiprecise Integer Arithmetic Package . . . . . . . . . . . . . . . . 89--107 R. A. Fatoohi Multitasking a Navier--Stokes Algorithm on the CRAY-2 . . . . . . . . . . . . . 109--124 Yau Shu Wong and Hong Jiang Approximate Polynomial Preconditionings Applied to Biharmonic Equations . . . . 125--145 Anonymous Contributing authors . . . . . . . . . . 147--148
Andrew W. Appel and Aage Bendiksen Vectorized Garbage Collection . . . . . 151--160 Laurence Boxer and Russ Miller Dynamic Computational Geometry on Meshes and Hypercubes . . . . . . . . . . . . . 161--191 Christian H. Bischof Adaptive Blocking in the QR Factorization . . . . . . . . . . . . . 193--208 Jinwoon Woo and Sartaj Sahni Letter to the Editor: Hypercube Computing: Connected Components . . . . 209--234 K. Ranai and B. Srinivasan Complexity analysis of benchmark software for supercomputers . . . . . . 235--242 Anonymous Contributing authors . . . . . . . . . . 243--245
Eric Lander and Jill P. Mesirov and Washington Taylor, IV Study of Protein Sequence Comparison Metrics on the Connection Machine CM-2 255--269 Rajiv Gupta and Mary Lou Soffa Compilation Techniques for a Reconfigurable LIW Architecture . . . . 271--304 Daniel V. Pryor and Patrick J. Burns Vectorized Monte Carlo molecular aerodynamics simulation of the Rayleigh problem . . . . . . . . . . . . . . . . 305--330 Margreet Louter-Nool Letter to the Editor: LINPACK routines based on level 2 BLAS . . . . . . . . . 331--349 (or 331--339??) Jon Lee A quirk of the CRAY CFT77 compiler: IF (logical) in lieu of IF (arithmetic) . . 351--355 Anonymous Contributing authors . . . . . . . . . . 357--359
Gurindar S. Sohi and Wei-Chung Hsu The Use of Intermediate Memories for Low-Latency Memory Access in Supercomputer Scalar Units . . . . . . . 5--21 David H. Bailey FFTs in External or Hierarchical Memory 23--35 Mark D. Guzzi and David A. Padua and Jay Hoeflinger and Duncan H. Lawrie Cedar Fortran and Other Vector and Parallel Fortran Dialects . . . . . . . 37--62 Yoshikazu Tanaka and Kyouko Iwasawa and Yukio Umetani and Shizuo Gotou Compiling Techniques for First-Order Linear Recurrences on a Vector Computer 63--82 Leslie Ann Henderson and Robert E. Hiromoto and Olaf M. Lubeck and Margaret L. Simmons On the Use of Diagnostic Dependence-Analysis Tools in Parallel Programming: Experiences using PTOOL . . 83--96
David A. Carlson and Binay Sugla Limited Width Parallel Prefix Circuits 107--129 Wayne Pfeiffer and Arnold Alagar and Anke Kamrath and Robert H. Leary and Jack Rogers Benchmarking and Optimization of Scientific Codes on the CRAY X-MP, CRAY-2, and SCS-40 Vector Computers . . 131--152 Margaret L. Simmons and Harvey J. Wasserman Performance Comparison of the CRAY-2 and CRAY X-MP/416 Supercomputers . . . . . . 153--167 Sanjay Ranka and Sartaj Sahni Computing Hough Transforms on Hypercube Multicomputers . . . . . . . . . . . . . 169--190 Anonymous Contributing authors . . . . . . . . . . 191--194
Weicheng Shen and A. Yavuz Oruç Systolic Arrays for Multidimensional Discrete Transforms . . . . . . . . . . 201--222 Peizong Lee and Zvi M. Kedem On High-Speed Computing with a Programmable Linear Array . . . . . . . 223--249 Takao Tsuda and Yoshitoshi Kunieda V-Pascal: An Automatic Vectorizing Compiler for Pascal with no Language Extensions . . . . . . . . . . . . . . . 251--275 Clive F. Baillie and Desmond A. Johnston and Gregory W. Kilcup Letter to the Editor: Status and prospects of the computational approach to high-energy physics . . . . . . . . . 277--300 Arnold Reisman Letter to the Editor: Ionizing radiation effects on ULSI device yield and reliability . . . . . . . . . . . . . . 301--310 Anonymous Contributing authors . . . . . . . . . . 311--314
Anonymous Editorial . . . . . . . . . . . . . . . 319--320 Michael Wolfe Data Dependence and Program Restructuring . . . . . . . . . . . . . 321--344 David A. Carlson Using Local Memory to Boost the Performance of FFT Algorithms on the CRAY-2 Supercomputer . . . . . . . . . . 345--356 David H. Bailey and King Lee and Horst D. Simon Using Strassen's algorithm to accelerate the solution of linear systems . . . . . 357--371 Anonymous Contributing authors . . . . . . . . . . 373--374
A. E. Brenner and J. P. Riganati and P. B. Schneck Sidney Fernbach --- 1917--1991 --- In Memoriam . . . . . . . . . . . . . . . . 5--6 Luis F. Ortiz and Ron Y. Pinter and Shlomit S. Pinter An Array Language for Data Parallelism: Definition, Compilation, and Applications . . . . . . . . . . . . . . 7--29 Tom MacDonald C for Numerical Computing . . . . . . . 31--48 George Marsaglia Normal (Gaussian) Random Variables for Supercomputers . . . . . . . . . . . . . 49--55 Lixin Tao and Eva Ma Simulating Parallel Neighboring Communications among Square Meshes and Square Toruses . . . . . . . . . . . . . 57--71 Jinwoon Woo and Sartaj Sahni Computing Biconnected Components on a Hypercube . . . . . . . . . . . . . . . 73--87 Anonymous Contributing authors . . . . . . . . . . 89--92
Anonymous Editorial . . . . . . . . . . . . . . . 97--97 Lewis Stiller Group Graphs and Computational Symmetry on Massively Parallel Architecture . . . 99--117 Parthasarathy P. Tirumalai and Meng Lee and Michael S. Schlansker Parallelization of WHILE Loops on Pipelined Architectures . . . . . . . . 119--136 Allen D. Malony and John L. Larson and Daniel A. Reed Tracing application program execution on the Cray X-MP and Cray-2 . . . . . . . . 137--162 Yong-Fong Lee and Barbara G. Ryder and Thomas J. Marlowe Experiences with a Parallel Algorithm for Data Flow Analysis . . . . . . . . . 163--188 R. W. Johnson and C. H. Huang and J. R. Johnson Multilinear Algebra and Parallel Programming . . . . . . . . . . . . . . 189--217 Zarka Cvetanovic and Edward G. Freedman and Charles Nofsinger Efficient Decomposition and Performance of Parallel PDE, FFT, Monte Carlo Simulations, Simplex, and Sparse Solvers 219--238 Anonymous Contributing authors . . . . . . . . . . 239--244
Guest Editorial Editorial . . . . . . . . . . . . . . . 249--250 Ken Kennedy Software for Supercomputers of the Future . . . . . . . . . . . . . . . . . 251--262 Henry G. Dietz and Abderrazek Zaafrani and Matthew T. O'Keefe Static Scheduling for Barrier MIMD Architectures . . . . . . . . . . . . . 263--289 Qasim Sheikh and Phuong Vu and Chao Yang and Michael Merchant Implementation of the Level 2 and 3 BLAS on the CRAY Y-MP and the CRAY-2 . . . . 291--305 Stanley C. Ahalt and Prakoon Chen and Cheng-Taou Chou and Tzyy-Ping Jung Implementation of a Vector Quantization Codebook Design Technique based on a Competitive Learning Artificial Neural Network . . . . . . . . . . . . . . . . 307--330 Sreejit Chakravarty and Ajay Shekhawat Parallel and Serial Heuristics for the Minimum Set Cover Problem . . . . . . . 331--345 D. C. S. Allison and K. M. Irani and C. J. Ribbens and L. T. Watson High-Dimensional Homotopy Curve Tracking on a Shared-Memory Multiprocessor . . . 347--366 Anonymous Contributing authors . . . . . . . . . . 367--372
Mikhail N. Dorozhevets and Peter Wolcott The El'brus-3 and MARS-M: Recent Advances in Russian High-Performance Computing . . . . . . . . . . . . . . . 5--48 Allen R. Hainline and Steven R. Thompson and Lawrence L. Halcomb Vector Performance Estimation for CRAY X-MP/Y-MP Supercomputers . . . . . . . . 49--70 Shlomo Weiss Memory Conflict Resolution in Vector Supercomputers . . . . . . . . . . . . . 71--85 Narsingh Deo and Sushil Prasad Parallel heap: an optimal parallel priority queue . . . . . . . . . . . . . 87--98 Anonymous Contributing authors . . . . . . . . . . 99--101
David A. Carlson Ultrahigh-performance FFTs for the Cray-2 and Cray Y-MP supercomputers . . 107--116 V. Venkatakrishnan and Horst D. Simon and Timothy J. Barth A MIMD Implementation of a Parallel Euler Solver for Unstructured Grids . . 117--137 S. J. Plimpton and J. R. Michael and A. D. Romig, Jr. Parallel Simulation of Electron-Solid Interactions for Electron Microscopy Modeling . . . . . . . . . . . . . . . . 139--151 G. A. Geist and B. Ginatempo and W. A. Shelton and G. M. Stocks Parallel Superconductor Code on the iPSC/860 . . . . . . . . . . . . . . . . 153--162 Daniel R. Ripoll and Stephen J. Thomas A Parallel Monte Carlo Search Algorithm for the Conformational Analysis of Polypeptides . . . . . . . . . . . . . . 163--185 Anonymous Contributing authors . . . . . . . . . . 187--190
Joachim Frank and Siegfried Knecht Lattice quantum hadrodynamics on a Cray Y-MP . . . . . . . . . . . . . . . . . . 195--209 Yousry Y. Azmy Performance and Performance Modeling of a Parallel Algorithm for Solving the Neutron Transport Equation . . . . . . . 211--235 Steven G. Kratzer Sparse QR Factorization on a Massively Parallel Computer . . . . . . . . . . . 237--255 Yeh-Ching Chung and Sanjay Ranka Mapping Finite Element Graphs on Hypercubes . . . . . . . . . . . . . . . 257--282 Maya Gokhale and William Carlson An Introduction to Compilation Issues for Parallel Machines . . . . . . . . . 283--314 Glenn R. Luecke and Jae Heon Yun and Philip W. Smith Performance of Parallel Cholesky Factorization Algorithms Using BLAS . . 315--329 Anonymous Contributing authors . . . . . . . . . . 331--334
Richard Draper and John Riganati Special Issue on Instruction-Level Parallelism --- Introduction . . . . . . 5--5 Josh Fisher and Bob Rau Guest Editors' Introduction . . . . . . 7--7 B. Ramakrishna Rau and Joseph A. Fisher Instruction-Level Parallel Processing: History, Overview, and Perspective . . . 9--50 P. Geoffrey Lowney and Stefan M. Freudenberger and Thomas J. Karzes and W. D. Lichtenstein and Robert P. Nix and John S. O'Donnell and John C. Ruttenberg The Multiflow Trace Scheduling Compiler 51--142 Gary R. Beck and David W. L. Yen and Thomas L. Anderson The Cydra 5 Minisupercomputer: Architecture and Implementation . . . . 143--180 James C. Dehnert and Ross A. Towle Compiling for the Cydra 5 . . . . . . . 181--227 Wen-mei W. Hwu and Scott A. Mahlke and William Y. Chen and Pohua P. Chang and Nancy J. Warter and Roger A. Bringmann and Roland G. Ouellette and Richard E. Hank and Tokuzo Kiyohara and Grant E. Haab and J. G. Holm and D. M. Lavery The Superblock: An Effective Technique for VLIW and Superscalar Compilation . . 229--248 Michael A. Schuette and John P. Shen Instruction-level experimental evaluation of the Multiflow TRACE 14/300 VLIW computer . . . . . . . . . . . . . 249--271 Anonymous Contributing authors . . . . . . . . . . 273--282
Richard J. Procassini and Scott R. Whitman and William P. Dannevik Porting a Global Ocean Model onto a Shared-Memory Multiprocessor: Observations and Guidelines . . . . . . 287--321 Per Ling A Set of High-Performance Level 3 BLAS Structured and Tuned for the IBM 3090 VF and Implemented in Fortran 77 . . . . . 323--355 Raymond R. Glenn and Daniel V. Pryor and John M. Conroy and Theodore Johnson A Bistability Throughput Phenomenon in a Shared-Memory MIMD Machine . . . . . . . 357--375 Anonymous Contributing authors . . . . . . . . . . 377--379
Anonymous Editorial . . . . . . . . . . . . . . . 385--386 Kathleen Knobe and Venkataraman Natarajan Automatic Data Allocation to Minimize Communication on SIMD Machines . . . . . 387--415 R. G. Hohlfeld and N. F. Comins and D. Shalit and P. A. Shorey and R. C. Giles Implementation of Particle-in-Cell Stellar Dynamics Codes on the Connection Machine-2 . . . . . . . . . . . . . . . 417--436 Steven R. Thompson and Allen R. Hainline and Lawrence L. Halcomb Vector performance estimation for Cray X-MP/Y-MP supercomputers, Part 2 . . . . 437--467 Anonymous Contributing authors . . . . . . . . . . 469--471
Gordon Lyon and Robert Snelick and Raghu Kacker Synthetic-Perturbation Tuning of MIMD Programs . . . . . . . . . . . . . . . . 5--28 J. G. Blom and J. G. Verwer Vectorizing Matrix Operations Arising From PDE Discretization on 9-Point Stencils . . . . . . . . . . . . . . . . 29--51 Nashat Mansour and Geoffrey C. Fox Parallel Physical Optimization Algorithms for Allocating Data to Multicomputer Nodes . . . . . . . . . . 53--80
R. A. Fatoohi Adapting a Navier--Stokes Solver for Three Parallel Machines . . . . . . . . 91--115 Charlotte Froese Fischer and Ming Tong and Murry Bentley and Zuchang Shen and C. Ravimohan A Distributed-Memory Implementation of the MCHF Atomic Structure Package . . . 117--134 Gary W. Elsesser and Viet N. Ngo and Sourav Bhattacharya and Wei-Tek Tsai Processor Preallocation and Load Balancing of DOALL Loops . . . . . . . . 135--161 Mazin S. Yousif and Chita R. Das and Matthew J. Thazhuthaveetil A Cache Coherence Protocol for MIN-Based Multiprocessors . . . . . . . . . . . . 163--185 Anonymous Contributing authors . . . . . . . . . . 187--189
Bruce A. Shapiro and Joseph Navetta A Massively Parallel Genetic Algorithm for RNA Secondary Structure Prediction 195--207 Kay A. Robbins and Steven Robbins Relationship Between Average and Real Memory Behavior . . . . . . . . . . . . 209--232 Yong-Fong Lee and Barbara G. Ryder Effectively Exploiting Parallelism in Data Flow Analysis . . . . . . . . . . . 233--262 Alfred J. Boals and Ajay K. Gupta and Naveed A. Sherwani Incomplete hypercubes: Algorithms and embeddings . . . . . . . . . . . . . . . 263--294 Anonymous Contributing authors . . . . . . . . . . 295--298
Rajeev Thakur and Ravi Ponnusamy and Alok Choudhary and Geoffrey Fox Complete Exchange on the CM-5 and Touchstone Delta . . . . . . . . . . . . 305--328 D. J. Mavriplis and Raja Das and Joel Saltz and R. E. Vermeland Implementation of a Parallel Unstructured Euler Solver on Shared- and Distributed-Memory Architectures . . . . 329--344 K. Farkas and Z. Vranesic and M. Stumm Scalable Cache Consistency for Hierarchically Structured Multiprocessors . . . . . . . . . . . . 345--369 R. H. B. Netzer and B. P. Miller Optimal Tracing and Replay for Debugging Message-Passing Parallel Programs . . . 371--388
Min-You Wu and Wolfgang Schröder-Preikschat and R. Draper and J. Riganati Special Issue on Trends in Parallel Operating Systems . . . . . . . . . . . 5--6 Andrew S. Tanenbaum A Comparison of Three Microkernels . . . 7--22 Anand R. Tripathi and Neeran M. Karnik Trends in Multiprocessor and Distributed Operating Systems Designs . . . . . . . 23--49 Mahendra Ramachandran and Mukesh Singhal Decentralized Semaphore Support in a Virtual Shared-Memory System . . . . . . 51--70 H. Assenmacher and T. Breitbach and P. Buhler and V. Hubsch and H. Peine and R. Schwarz Parallel Programming in Panda . . . 71--90 Wei Shu Run-Time Support for User-Level Ultralightweight Threads on Distributed-Memory Computers . . . . . . 91--103 Ronald C. Unrau and Orran Krieger and Benjamin Gamsa and Michael Stumm Hierarchical Clustering: a Structure for Scalable Multiprocessor Operating System Design . . . . . . . . . . . . . . . . . 105--134 Ishfaq Ahmad A massively parallel fault-tolerant architecture for time-critical computing 135--162 Kevin R. Wadleigh and Gary B. Gostin and John Liu High-Performance FFT Algorithms for the Convex C4/XA Supercomputer . . . . . . . 163--178 Vilmar Trevisan and João B. Carvalho The Composite Character of the Twenty-Second Fermat Number . . . . . . 179--182 Jang-Ping Sheu and Tzung-Shi Chen Partitioning and Mapping of Nested Loops for Linear Array Multicomputers . . . . 183--202 Min-You Wu Contributing Authors . . . . . . . . . . 203--211 Anonymous Editorial . . . . . . . . . . . . . . . ??
Duncan A. Buell and Kenneth L. Pocek Custom computing machines: an introduction . . . . . . . . . . . . . . 219--229 Christian Iseli and Eduardo Sanchez Spyder: A SURE (SUperscalar and REconfigurable) Processor . . . . . . . 231--252 Xiaoping Ling and Hideharu Amano WASMII: An MPLD with Data-Driven Control on a Virtual Hardware . . . . . . . . . 253--276 Jeffrey M. Arnold The Splash 2 Software Environment . . . 277--290 Maya Gokhale and Brian Schott Data-Parallel C on a Reconfigurable Logic Array . . . . . . . . . . . . . . 291--313 Marianne E. Louie and Milos D. Ercegovac A Variable-Precision Square Root Implementation for Field Programmable Gate Arrays . . . . . . . . . . . . . . 315--336 Anonymous Contributing Authors . . . . . . . . . . 337--340
Anonymous Editorial . . . . . . . . . . . . . . . 345--346 Michael Prestin and Leonid Shtilman A Parallel Navier--Stokes Solver: The Meiko Implementation . . . . . . . . . . 347--364 J. Ramanujam Beyond Unimodular Transformations . . . 365--389 Hui Gao and John L. Larson Workload characterization using the Cray Hardware Performance Monitor . . . . . . 391--412 Anonymous Contributing Authors . . . . . . . . . . 413--414
Robert Cypher and Alex Ho and Smaragda Konstantinidou and Paul Messina A Quantitative Study of Parallel Scientific Applications with Explicit Communication . . . . . . . . . . . . . 5--24 Gordon D. B. Cameron and Gordon I. D. Duncan PARAMICS --- Parallel Microscopic Simulation of Road Traffic . . . . . . . 25--53 Seungjo Bae and Sanjay Ranka A Comparison of Different Message-Passing Paradigms for the Parallelization of Two Irregular Applications . . . . . . . . . . . . . . 55--85 Douglas C. Burger and Rahmat S. Hyder and Barton P. Miller and David A. Wood Paging Tradeoffs in Distributed-Shared-Memory Multiprocessors . . . . . . . . . . . . 87--104 Anonymous Contributing Authors . . . . . . . . . . 105
Chao-Wei Ou and Sanjay Ranka and Geoffrey Fox Fast and Parallel Mapping Algorithms for Irregular Problems . . . . . . . . . . . 119--140 David A. Bader and Joseph JáJá and David Harwood and Larry S. Davis Parallel Algorithms for Image Enhancement and Segmentation by Region Growing, with an Experimental Study . . 141--168 Jarosław Nieplocha and Robert J. Harrison and Richard J. Littlefield Global Arrays: a Nonuniform Memory Access Programming Model for High-Performance Computers . . . . . . . 169--189 Kent E. Seamons and Marianne Winslett Multidimensional Array I/O in Panda 1.0 . . . . . . . . . . . . . . . . . . 191--211 Anonymous Contributing Authors . . . . . . . . . . 213
Tarek S. Abdelrahman Latency Hiding on COMA Multiprocessors 225--242 Hamid R. Arabnia and Suchendra M. Bhandarkar Parallel Stereocorrelation on a Reconfigurable Multi-Ring Network . . . 243--269 Joanne F. Houlahan and Lenore J. Cowen and Gerald M. Masson Hypercube Sandwich Approach to Conferencing . . . . . . . . . . . . . . 271--283 David Shoemaker and Frank Honoré and Chris Metcalf and Steve Ward NuMesh: An Architecture Optimized for Scheduled Communication . . . . . . . . 285--302 Anonymous Contributing Authors . . . . . . . . . . 303--305 H. R. Arabnia A Special Issue on Parallel and Distributed Processing . . . . . . . . . U2--U3 R. Draper and J. Riganati Daniel Shanks, 1917--1996 --- In Memoriam . . . . . . . . . . . . . . . . U3--U3
Hassan Fallah-Adl and Joseph JáJá and Shunlin Liang Fast Algorithms for Estimating Aerosol Optical Depth and Correcting Thematic Mapper Imagery . . . . . . . . . . . . . 315--329 J. Wang and P. Liewer and E. Huang Three-Dimensional Electromagnetic Particle-in-Cell with Monte Carlo Collision Simulations on Three MIMD Parallel Computers . . . . . . . . . . . 331--348 Ishfaq Ahmad and Ka-Cheong Leung and Hsiao-Ming Hsu Multiprocessing Ocean Circulation: Modeling, Implementation, and Performance on the Intel Paragon . . . . 349--369 Brandon Dixon and John Swallow High-Performance Sorting Algorithms for the CRAY T3D Parallel Computer . . . . . 371--395 Anonymous Contributing Authors . . . . . . . . . . 397--400
Richard Draper and John Riganati Farewell Editorial . . . . . . . . . . . 5--6 A. Averbuch and L. Ioffe and M. Israeli and L. Vozovoi Highly Scalable Two- and Three-Dimensional Navier--Stokes Parallel Solvers on MIMD Multiprocessors 7--39 D. N. Jayasimha and M. E. Hayder and S. K. Pillay An Evaluation of Architectural Platforms for Parallel Navier--Stokes Computations 41--60 Suchendra M. Bhandarkar and Salem Machaka Chromosome Reconstruction from Physical Maps Using a Cluster of Workstations . . 61--86 Anonymous Contributing Authors . . . . . . . . . . 87
Salim Hariri Editorial: Special Issue on High Performance Distributed Computing . . . 99--99 K. Mani Chandy and Joseph Kiniry and Adam Rifkin and Daniel Zimmerman Webs of Archived Distributed Computations for Asynchronous Collaboration . . . . . . . . . . . . . 101--118 Jarek Nieplocha and Robert J. Harrison Shared Memory Programming in Metacomputing Environments: The Global Array Approach . . . . . . . . . . . . . 119--136 Luis M. Silva and João Gabriel Silva Checkpointing Distributed Shared Memory 137--158 Sung-Yong Park and Salim Hariri A High Performance Message Passing System for Network of Workstations . . . 159--180 Prasenjit Sarkar and Mary Bailey Adapting the Network Interface for High-Performance Computing: The CNI Approach . . . . . . . . . . . . . . . . 181--200 Anonymous Contributing Authors . . . . . . . . . . 201--204
Phil Andrews Supercomputing in Medicine --- Introduction . . . . . . . . . . . . . . 211--212 David M. McQueen and Charles S. Peskin Shared-Memory Parallel Vector Implementation of the Immersed Boundary Method for the Computation of Blood Flow in the Beating Mammalian Heart . . . . . 213--236 Alexander J. Ropelewski and Hugh B. Nicholas, Jr. and David W. Deerfield, II Implementation of Genetic Sequence Alignment Programs on Supercomputers . . 237--253 Michael F. Crowley and Tom A. Darden and Thomas E. Cheatham, III and David W. Deerfield, II Adventures in Improving the Scaling and Accuracy of a Parallel Molecular Dynamics Program . . . . . . . . . . . . 255--278 Arthur W. Wetzel Computational Aspects of Pathology Image Classification and Retrieval . . . . . . 279--293 Nigel H. Goddard and Greg Hood and Jonathan D. Cohen and William F. Eddy and Christopher R. Genovese and Douglas C. Noll and Leigh E. Nystrom Online Analysis of Functional MRI Datasets on Parallel Platforms . . . . . 295--318 Anonymous Contributing Authors . . . . . . . . . . 319--324
Peter Graham and Brian J. d'Auriol Editorial: Special Issue --- High-Performance Computing Systems . . . 331--331 Christian Lengauer and Sergei Gorlatch and Christoph A. Herrmann The Static Parallelization of Loops and Recursions . . . . . . . . . . . . . . . 333--353 Matthias Brune and Jörn Gehring and Alexander Reinefeld Heterogeneous Message Passing and a Link to Resource Management . . . . . . . . . 355--369 Edward Chan and Daoxu Chen and Victor Lee and Jiannong Cao and Chan-Hee Lee On the Effectiveness of the FDDI-M Medium Access Protocol for Real-Time Traffic . . . . . . . . . . . . . . . . 371--389 Keqin Li Constant Time Boolean Matrix Multiplication on a Linear Array with a Reconfigurable Pipelined Bus System . . 391--403 Kuang-Chih Liu and Chung-Ta King A Performance Study on Bounteous Transfer in Multiprocessor Sectored Caches . . . . . . . . . . . . . . . . . 405--420 Anonymous Contributing Authors . . . . . . . . . . 421--422
Hamid R. Arabnia Editorial . . . . . . . . . . . . . . . 5--6 James B. Armstrong and Muthucumaru Maheswaran and Mitchell D. Theys and Howard Jay Siegel and Mark A. Nichols and Kenneth H. Casey Parallel Image Correlation: Case Study to Examine Trade-Offs in Algorithm-to-Machine Mappings . . . . . 7--35 A. M. Wallace and G. J. Michaelson and N. Scaife and W. J. Austin A Dual Source, Parallel Architecture for Computer Vision . . . . . . . . . . . . 37--56 Alan Heirich and James Arvo A Competitive Analysis of Load Balancing Strategies for Parallel Ray Tracing . . 57--68 Hongchi Shi and Paul Gader and Hongzheng Li Parallel Mesh Algorithms for Grid Graph Shortest Paths with Application to Separation of Touching Chromosomes . . . 69--83 Sergei Gorlatch Programming with Divide-and-Conquer Skeletons: a Case Study of FFT . . . . . 85--97 Linda Yang and Manavendra Misra Coarse-Grained Parallel Algorithms for Multi-Dimensional Wavelet Transforms . . 99--118 Dominique Houzet and Abdelkrim Fatni PC-Based Shared Memory Architecture and Language . . . . . . . . . . . . . . . . 119--136 Frederic H. Mabin and Catherine Mongenet A Parallel Algorithm to Reconstruct Bounding Surfaces in 3D Images . . . . 137--155 Zoltan Juhasz An Analytical Method for Predicting the Performance of Parallel Image Processing Operations . . . . . . . . . . . . . . . 157--174
Moez Ayed and Jean-Luc Gaudiot Analysis of a Heuristic for Code Partitioning . . . . . . . . . . . . . . 191--226 Thomas Fahringer Efficient Symbolic Analysis for Parallelizing Compilers and Performance Estimators . . . . . . . . . . . . . . . 227--252 Ching-Hsien Hsu and Yeh-Ching Chung Efficient Methods for $ k r \longrightarrow r $ and $ r \longrightarrow k r $ Array Redistribution . . . . . . . . . . . . . 253--276 Shahram Latifi and Pradip K. Srimani SEP: a Fixed Degree Regular Network for Massively Parallel Systems . . . . . . . 277--291
Srinivas Aluru and John Gustafson and G. M. Prabhu and Fatih E. Sevilgen Distribution-Independent Hierarchical Algorithms for the N-Body Problem . . 303--323 Sekhar Darbha and Santosh Pande A Robust Compile Time Method for Scheduling Task Parallelism on Distributed Memory Machines . . . . . . 325--347 Tarek S. Abdelrahman and Thomas N. Wong Compiler Support for Array Distribution on NUMA Shared Memory Multiprocessors 349--371 Yen-Chun Lin and Jyh-Chian Chen An Efficient Systolic Algorithm for the Longest Common Subsequence Problem . . . 373--385 John R. Budenske and Ranga S. Ramanujan and Howard Jay Siegel A Method for the On-Line Use of Off-Line Derived Remappings of Iterative Automatic Target Recognition Tasks onto a Particular Class of Heterogeneous Parallel Platforms . . . . . . . . . . . 387--406
Kei-Chun Li and Kang Zhang Supporting Scalable Performance Monitoring and Analysis of Parallel Programs . . . . . . . . . . . . . . . . 5--31 C. Eckert and M. Pizka Improving Resource Management in Distributed Systems using Language-Level Structuring Concepts . . . . . . . . . . 33--55 Youichi Omori and Akira Fukuda and Kazuki Joe An Object-Oriented Framework for Loop Parallelization . . . . . . . . . . . . 57--69 Ada Waichee Fu and Siu-Cheung Chau Locating Corruptions in a Replicated File in a Distributed Environment . . . 71--97
Mounir Hamdi and Yi Pan and B. Hamidzadeh and F. M. Lim Parallel Computing on an Ethernet Cluster of Workstations: Opportunities and Constraints . . . . . . . . . . . . 111--132 Pablo Galdámez and Declan Murphy and José M. Bernabéu-Aubán and Francesc D. Muñoz-Escoí Event-Based Techniques to Debug an Object Request Broker . . . . . . . . . 133--149 Soon M. Chung and Arindam Chatterjee Parallel Distributive Join Algorithm on the Intel Paragon . . . . . . . . . . . 151--169 José M. Claver Parallel Wavefront Algorithms Solving Lyapunov Equations for the Cholesky Factor on Message Passing Multiprocessors . . . . . . . . . . . . 171--189 Tianruo Yang and Hai-Xiang Lin Parallel Performance Analysis of the Improved Quasi-Minimal Residual Method on Bulk Synchronous Parallel Architectures . . . . . . . . . . . . . 191--210 P. J. Coelho and P. A. Novo and M. G. Carvalho Modelling of a Utility Boiler Using Parallel Computing . . . . . . . . . . . 211--232
Mark J. Clement and Xian-He Sun Editorial . . . . . . . . . . . . . . . 247--247 Mats Brorsson and Martin Kral Performance Tuning Software DSM Applications using Visualisation . . . . 249--265 Mark J. Clement and Glenn M. Judd and Bryan S. Morse and J. Kelly Flanagan Performance Surface Prediction for WAN-Based Clusters . . . . . . . . . . . 267--281 Yong Luo Shared Memory vs. Message Passing: The COMOPS Benchmark Experiment . . . . . . 283--301 Jeffrey C. Becker and Bill Nitzberg and Rob F. Van Der Wijngaart and Maurice Yarrow Predicting Price/Performance Trade-offs for Whitney: a Commodity Computing Cluster . . . . . . . . . . . . . . . . 303--319 John L. Gustafson and Rajat Todi Conventional Benchmarks as a Sample of the Performance Spectrum . . . . . . . . 321--342
S. Q. Zheng and Balaji Calidas and Yanjun Zhang An Efficient General In-Place Parallel Sorting Scheme . . . . . . . . . . . . . 5--17 D. R. Avresky and V. Shurbanov and R. Horst and W. Watson and L. Young and D. Jewett Performance Modeling of ServerNet™ SAN Topologies . . . . . . . . . . . . . . . 19--37 Yen-Chun Lin and Chao-Cheng Shih A New Class of Depth-Size Optimal Parallel Prefix Circuits . . . . . . . . 39--52 Fadi N. Sibai Optimal Clustering of Hierarchical Hyper-Ring Multicomputers . . . . . . . 53--76 Dianqin Wang and Eleanor Chu Minimizing Communication Penalty of Triangular Solvers by Runtime Mesh Configuration and Workload Redistribution . . . . . . . . . . . . . 77--95
Edward E. E. Frietman and Ramon J. Ernst and Roy Crosbie and Masao Shimoji Prospects for Optical Interconnects in Distributed, Shared-Memory Organized MIMD Architectures . . . . . . . . . . . 107--128 Roger Espasa and Mateo Valero A Simulation Study of Decoupled Vector Architectures . . . . . . . . . . . . . 129--152 Agostino Poggi and Giuliano Armano A Distributed Object-Oriented Language for Heterogeneous Computer Networks . . 153--169 Dolors Royo and Antonio González and Miguel Valero-García Low Communication Overhead Jacobi Algorithms for Eigenvalues Computation on Hypercubes . . . . . . . . . . . . . 171--193
Jinsong Ouyang and Piyush Maheshwari Supporting Cost-Effective Fault Tolerance in Distributed Message-Passing Applications with File Operations . . . 207--232 Hoi-Man Yip and Ishfaq Ahmad and Ting-Chuen Pong An Efficient Parallel Algorithm for Computing the Gaussian Convolution of Multi-dimensional Image Data . . . . . . 233--255 Muthucumaru Maheswaran and Kevin J. Webb and Howard Jay Siegel MCGS: a Modified Conjugate Gradient Squared Algorithm for Nonsymmetric Linear Systems . . . . . . . . . . . . . 257--280
Stephan Olariu and Ivan Stojmenovic and Albert Zomaya On the Dynamic Initialization of Parallel Computers . . . . . . . . . . . 5--24 Yeh-Ching Chung and Ching-Jung Liao and Don-Lin Yang A Prefix Code Matching Parallel Load-Balancing Method for Solution-Adaptive Unstructured Finite Element Graphs on Distributed Memory Multicomputers . . . . . . . . . . . . . 25--49 Hüseyin Kutluca and Tahsin M. Kurç and Cevdet Aykanat Image-Space Decomposition Algorithms for Sort-First Parallel Volume Rendering of Unstructured Grids . . . . . . . . . . . 51--93 Toshiyuki Imamura An Estimation of Complexity and Computational Costs for Vertical Block-Cyclic Distributed Parallel LU Factorization . . . . . . . . . . . . . 95--110
J. Bourgeois and F. Spies and M. J. Zemerly and T. Delaitre Chronos: a Performance Characterization Tool Inside the EDPEPPS Toolset . . . . 123--140 Thomas Schnekenburger Load Balancing in CORBA: a Survey of Concepts, Patterns, and Techniques . . . 141--161 Keqin Li and Yi Pan and Si Qing Zheng Efficient Deterministic and Probabilistic Simulations of PRAMs on Linear Arrays with Reconfigurable Pipelined Bus Systems . . . . . . . . . 163--181 Shahram Latifi and Myung Hoon Lee and Pradip K. Srimani Wormhole Broadcast in Hypercubes . . . . 183--192 Peter Benner and Maribel Castillo and Enrique S. Quintana-Ortí and Vicente Hernández Parallel Partial Stabilizing Algorithms for Large Linear Control Systems . . . . 193--206 Daisuke Takahashi and Yasumasa Kanada High-Performance Radix-$2$, $3$ and $5$ Parallel $1$-D Complex FFT Algorithms for Distributed-Memory Parallel Computers . . . . . . . . . . . . . . . 207--228
Kuei-Ping Shih and Jang-Ping Sheu and Chua-Huang Huang Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers . . . . . . . . 243--269 Thomas Rauber and Gudula Rünger Deriving Array Distributions by Optimization Techniques . . . . . . . . 271--293 Taesoon Park and Heon Y. Yeom A Low Overhead Logging Scheme for Fast Recovery in Distributed Shared Memory Systems . . . . . . . . . . . . . . . . 295--320 Jan-Jan Wu An Interleaving Transformation for Parallelizing Reductions for Distributed-Memory Parallel Machines . . 321--339
D. R. Avresky and B. W. Johnson and F. F. Lombardi Embedded Fault-Tolerant Systems --- Introduction . . . . . . . . . . . . . . 5--6 Douglas W. Caldwell and David A. Rennels A Minimalist Fault-Tolerant Microcontroller Design for Embedded Spacecraft Computing . . . . . . . . . . 7--25 Arun K. Somani and Allen M. Sansano Achieving Robustness and Minimizing Overhead in Parallel Algorithms Through Overlapped Communication/Computation . . 27--52 Joshua Haines and Vijay Lakamraju and Israel Koren and C. Mani Krishna Application-Level Fault Tolerance as a Complement to System-Level Fault Tolerance . . . . . . . . . . . . . . . 53--68 Igor Lyubashevskiy and Volker Strumpen Fault-Tolerant File-I/O for Portable Checkpointing Systems . . . . . . . . . 69--92 Guillermo A. Alvarez and Flaviu Cristian Simulation-based Testing of Communication Protocols for Dependable Embedded Systems . . . . . . . . . . . . 93--116 Kia Makki and John Dell and Niki Pissinou and W. Melody Moh and Xiaohua Jia Using Logical Rings to Solve the Distributed Mutual Exclusion Problem with Fault Tolerance Issues . . . . . . 117--132
H. Scott Hinton Introduction to Focus Articles on Optical Computing . . . . . . . . . . . 147--147 Guoping Liu and Kyungsook Y. Lee and Harry F. Jordan $n$-Dimensional Processor Arrays with Optical dBuses . . . . . . . . . . . . . 149--163 Carl E. Love and Harry F. Jordan Oscillatory Metastability in Optical Network Synchronizer Circuits . . . . . 165--176 Dietmar Fey and Marko Degenkolb Digit Pipelined Arithmetic for $3$-D Massively Parallel Optoelectronic Circuits . . . . . . . . . . . . . . . . 177--196 Sunil Kim and Alexander V. Veidenbaum On Interaction between Interconnection Network Design and Latency Hiding Techniques in Multiprocessors . . . . . 197--216 Takahiro Koita and Tetsuro Katayama and Keizo Saisho and Akira Fukuda Memory Conscious Scheduling for Cluster-based NUMA Multiprocessors . . . 217--235 Kuo-Pao Fan and Chung-Ta King Turn Grouping for Multicast in Wormhole-Routed Mesh Networks Supporting the Turn Model . . . . . . . . . . . . . 237--260
Hongzheng Li and Hongchi Shi A Parallel Algorithm for Volume Projections on SIMD Mesh-Connected Computers . . . . . . . . . . . . . . . 5--22 Ching-Hsien Hsu and Yeh-Ching Chung and Chyi-Ren Dow Efficient Methods for Multi-Dimensional Array Redistribution . . . . . . . . . . 23--46 Sandeep K. S. Gupta and Pradip K. Srimani UpdateSearch: a New Dynamic Channel Allocation Scheme for Mobile Networks That Can Adjust to System Loads . . . . 47--65 Kenneth Hoganson Mapping Parallel Application Communication Topology to Rhombic Overlapping-Cluster Multiprocessors . . 67--90 Don-Lin Yang and Yeh-Ching Chung and Chih-Chang Chen and Ching-Jung Liao A Dynamic Diffusion Optimization Method for Irregular Finite Element Graph Partitioning . . . . . . . . . . . . . . 91--110
A. Averbuch and B. Epstein and L. Ioffe and I. Yavneh Efficient Parallelization of a Three-Dimensional Navier--Stokes Solver on MIMD Multiprocessors . . . . . . . . 123--142 Yosi Ben-Asher and Esti Stein Basic Results in Automatic Transformations of Shared Memory Parallel Programs into Sequential Programs . . . . . . . . . . . . . . . . 143--165 N. Melab and E.-G. Talbi and S. Petiton A Parallel Adaptive Gauss--Jordan Algorithm . . . . . . . . . . . . . . . 167--185 Chih-Yung Chang and Tzung-Shi Chen and Jang-Ping Sheu Improving Memory Traffic by Assembly-Level Exploitation of Reuses for Vector Registers . . . . . . . . . . 187--204 Kuei-Ping Shih and Jang-Ping Sheu and Chih-Yung Chang Efficient Address Generation for Affine Subscripts in Data-Parallel Programs . . 205--227
Beniamino Di Martino and Antonino Mazzeo Introduction: Techniques and Tools for Parallel and Distributed Program Analysis, Development and Run-Time Support . . . . . . . . . . . . . . . . 243--244 Christoph W. Keßler NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model . . . . 245--262 Gerardo Bandera and Manuel Ujaldón and Emilio L. Zapata Compile and Run-Time Support for the Parallelization of Sparse Matrix Updating Algorithms . . . . . . . . . . 263--276 Roland Wismüller and Thomas Ludwig Interoperable Run-Time Tools for Distributed Systems --- a Case Study . . 277--289 Sergi Girona and Jesús Labarta Sensitivity of Performance Prediction of Message Passing Programs . . . . . . . . 291--298 Rocco Aversa and Beniamino Di Martino and Nicola Mazzocca and Umberto Villano Reducing Parallel Program Simulation Complexity by Static Analysis . . . . . 299--310 B. Chapman and J. Merlin and D. Pritchard and F. Bodin and Y. Mevel and T. Sørevik and L. Hill Program Development Tools for Clusters of Shared Memory Multiprocessors . . . . 311--322
Henry J. Alme and Garry H. Rodrigue and George B. Zimmerman Domain Decomposition Models for Parallel Monte Carlo Transport . . . . . . . . . 5--23 Woei-Jiunn Tsaur and Shi-Jinn Horng Auditing Causal Relationships of Group Multicast Communications in Group-Oriented Distributed Systems . . . 25--45 John P. Morrison and James J. Kennedy and David A. Power WebCom: a Web Based Volunteer Computer 47--61 Hung-Chang Hsiao and Chung-Ta King Exploiting Network Locality for CC-NUMA Multiprocessors . . . . . . . . . . . . 63--87 P. Pereira and L. Heutte and Y. Lecourtier Source-to-Source Instrumentation for the Optimization of an Automatic Reading System . . . . . . . . . . . . . . . . . 89--104
Hong Shen and Sam Lor and Piyush Maheshwari An Architecture-Independent Graphical Tool for Automatic Contention-Free Process-to-Processor Mapping . . . . . . 115--139 Patricia González and José C. Cabaleiro and Tomás F. Pena Parallel Computation of Wavelet Transforms Using the Lifting Scheme . . 141--152 M. B. Hadim and I. Sakho The Multi-Level Communication: Efficient Routing for Interconnection Networks . . 153--177 Hojung Cha and Dongho Lee H-BSP: a Hierarchical BSP Computation Model . . . . . . . . . . . . . . . . . 179--200 Don-Lin Yang and Jen-Chih Yu and Yeh-Ching Chung Efficient Compositing Methods for the Sort-Last-Sparse Parallel Volume Rendering System on Distributed Memory Multicomputers . . . . . . . . . . . . . 201--220
Erik Elmroth and Chris Ding and Yu-Shu Wu High Performance Computations for Large Scale Simulations of Subsurface Multiphase Fluid and Heat Flow . . . . . 235--258 Yun He and Chris H. Q. Ding Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications . . . . . . . . 259--277 Hung-Chang Hsiao and Chung-Ta King An Application-Driven Study of Multicast Communication for Write Invalidation . . 279--304 Rong-Guey Chang and Tyng-Ruey Chuang and Jenq Kuen Lee Parallel Sparse Supports for Array Intrinsic Functions of Fortran 90 . . . 305--339
Toomas P. Plaks and Graham M. Megson Engineering of Reconfigurable Hardware/Software Objects . . . . . . . 5--6 Kong Woei Susanto and Tom Melham Formally Analyzed Dynamic Synthesis of Hardware . . . . . . . . . . . . . . . . 7--22 Jonathan P. Bowen and He Jifeng An Approach to the Specification and Verification of a Hardware Compilation Scheme . . . . . . . . . . . . . . . . . 23--39 Dirk Fimmel and Renate Merker Design of Processor Arrays for Reconfigurable Architectures . . . . . . 41--56 Jürgen Teich and Sándor P. Fekete and Jörg Schepers Optimization of Dynamic Hardware Reconfigurations . . . . . . . . . . . . 57--75 Xuejun Liang and Jack Jean and Karen Tomko Data Buffering and Allocation in Mapping Generalized Template Matching on Reconfigurable Systems . . . . . . . . . 77--91 Alistair A. McEwan and Jonathan Saul A High Speed Reconfigurable Firewall Based On Parameterizable FPGA-based Content Addressable Memories . . . . . . 93--103 Jürgen Becker and Manfred Glesner A Parallel Dynamically Reconfigurable Architecture Designed for Flexible Application-Tailored Hardware/Software Systems in Future Mobile Communication 105--127
Jie Wu and Yuanyuan Yang The Postal Network: a Recursive Network for Parameterized Communication Model 143--161 J. Vigo-Aguiar and L. M. Quintales A Parallel ODE Solver Adapted to Oscillatory Problems . . . . . . . . . . 163--171 L. Wang and Ted C. Yang On the Boosting of Instruction Scheduling by Renaming . . . . . . . . . 173--197 Subhash Bhalla The Performance of an Efficient Distributed Synchronization and Recovery Algorithm . . . . . . . . . . . . . . . 199--219 Selim G. Akl and Stefan D. Bruda Improving a Solution's Quality Through Parallel Processing . . . . . . . . . . 221--233
Sundararajan Vedantham and Rajkumar Buyya and Mahbub Hassan Special Issue Guest Editorial: Cluster Computing Using High-Speed Networks . . 247--249 Sang-Hwa Chung and Hyuk-Chul Kwon and Kwang Ryel Ryu and Yoojin Chung and Hankook Jang and Cham-Ah Choi Information Retrieval on an SCI-Based PC Cluster . . . . . . . . . . . . . . . . 251--265 Hyeong-Do Kim and Chang-Sung Jeong Object Clustering for High Performance Parallel Computing . . . . . . . . . . . 267--283 K. A. Hawick and H. A. James Asynchronous Transfer Mode and other Network Technologies for Wide-Area and High-Performance Cluster Computing . . . 285--297 Yu-Kwong Kwok Fault-Tolerant Parallel Scheduling of Tasks on a Heterogeneous High-Performance Workstation Cluster . . 299--314 K. Takeda and N. K. Allsopp and J. C. Hardwick and P. C. Macey and D. A. Nicole and S. J. Cox and D. J. Lancaster An Assessment of MPI Environments for Windows NT . . . . . . . . . . . . . . . 315--323
Nick Antonopoulos and Alex Shafarenko An Active Organisation System for Customised, Secure Agent Discovery . . . 5--35 A. Al-Ayyoub and M. Ould-Khaoua and K. Day On the Performance of Parallel Matrix Factorisation on the Hypermesh . . . . . 37--53 Enrique S. Quintana-Ortí and Gregorio Quintana-Ortí and Maribel Castillo and Vicente Hernández Efficient Algorithms for the Block Hessenberg Form . . . . . . . . . . . . 55--66 Weng-Long Chang and Chih-Ping Chu and Jia-Hwa Wu Communication-Free Alignment for Array References with Linear Subscripts in Three Loop Index Variables or Quadratic Subscripts . . . . . . . . . . . . . . . 67--83
Vassilis Tsaoussidis Editorial Comments . . . . . . . . . . . 99--100 I. Batsiolas and Ioanis Nikolaidis Selective Idling: Experiments in Transport Layer Energy Conservation . . 101--114 Vassilis Tsaoussidis and A. Lahanas and C. Zhang The Wave & Probe Communication Mechanisms 115--135 Michael Mitzenmacher and Rajmohan Rajaraman Towards More Complete Models of TCP Latency and Throughput . . . . . . . . . 137--160 Peter Langendörfer and Rolf Kraemer and Hartmut König Evaluation of Well-Known Protocol Implementation Techniques for Application in Wireless Networks . . . . 161--173 Velibor Markovski and Fei Xue and Ljiljana Trajković Simulation and Analysis of Packet Loss in User Datagram Protocol Transfers . . 175--196
Dajin Wang A Low-Cost Fault-Tolerant Structure for the Hypercube . . . . . . . . . . . . . 203--216 San-Yuan Wang and Yu-Chee Tseng and Sze-Yao Ni and Jang-Ping Sheu Circuit-Switched Broadcasting in Multi-Port Multi-Dimensional Torus Networks . . . . . . . . . . . . . . . . 217--241 Minyi Guo and Ikuo Nakata A Framework for Efficient Data Redistribution on Distributed Memory Multicomputers . . . . . . . . . . . . . 243--265
Rod Oldehoeft Introduction to Software for High-Performance Systems Papers from the Symposium of The Los Alamos Computer Science Institute . . . . . . . . . . . 5--6 Keith D. Cooper and Devika Subramanian and Linda Torczon Adaptive Optimizing Compilers for the 21st Century . . . . . . . . . . . . . . 7--22 Steven J. Deitz and Bradford L. Chamberlain and Lawrence Snyder High-level Language Support for User-defined Reductions . . . . . . . . 23--37 Julian Cummings and Michael Aivazis and Ravi Samtaney and Raul Radovitzky and Sean Mauch and Dan Meiron A Virtual Test Facility for the Simulation of Dynamic Response in Materials . . . . . . . . . . . . . . . 39--50 Annette C. Feng and Apu C. Kapadia and Wu-Chun Feng and Geneva G. Belford Packet Spacing: An Enabling Mechanism for Delivering Multimedia Content in Computational Grids . . . . . . . . . . 51--66 Wu-Chun Feng and Mark K. Gardner and Jeffrey R. Hay The MAGNeT Toolkit: Design, Implementation and Evaluation . . . . . 67--79 John Mellor-Crummey and Robert J. Fowler and Gabriel Marin and Nathan Tallent HPCVIEW: a Tool for Top-down Analysis of Node Performance . . . . . . . . . . . . 81--104 Anonymous Erratum: Design and Prototype of a Performance Tool Interface for OpenMP 105--128 Jürgen Becker and Ahmad Alsolaim and Manfred Glesner and Janusz Starzyk A Parallel Dynamically Reconfigurable Architecture Designed for Flexible Application-Tailored Hardware/Software Systems in Future Mobile Communication 132--132
Soon M. Chung and Arindam Chatterjee An Adaptive Parallel Distributive Join Algorithm on a Cluster of Workstations 5--35 Vincent Loechner and Benoît Meister and Philippe Clauss Precise Data Locality Optimization of Nested Loops . . . . . . . . . . . . . . 37--76 Jean-Frédéric Myoupo and David Semé and Ivan Stojmenovic Optimal BSR Solutions to Several Convex Polygon Problems . . . . . . . . . . . . 77--90 Cheng-Zhong Xu and Le Yi Wang and Ngo-Tai Fong Stochastic Prediction of Execution Time for Dynamic Bulk Synchronous Computations . . . . . . . . . . . . . . 91--103
Toomas P. Plaks Guest Editor's Foreword: Engineering of Reconfigurable Systems . . . . . . . . . 115--116 W. Böhm and J. Hammes and B. Draper and M. Chawathe and C. Ross and R. Rinker and W. Najjar Mapping a Single Assignment Programming Language to Reconfigurable Systems . . . 117--130 Yajun Ha and Serge Vernalde and Patrick Schaumont and Marc Engels and Rudy Lauwereins and Hugo De Man Building a Virtual Framework for Networked Reconfigurable Hardware and Software Objects . . . . . . . . . . . . 131--144 Michael Eisenring and Marco Platzner A Framework for Run-time Reconfigurable Systems . . . . . . . . . . . . . . . . 145--159 Toomas P. Plaks Configuring of Algorithms in Mapping into Hardware . . . . . . . . . . . . . 161--177 Andreas Koch Compilation for Adaptive Computing Systems Using Complex Parameterized Hardware Objects . . . . . . . . . . . . 179--190 Tzung-Shi Chen and Chih-Yung Chang Skewed Data Partition and Alignment Techniques for Compiling Programs on Distributed Memory Multicomputers . . . 191--211
Geyong Min and Mohamed Ould-Khaoua A Comparative Study of Switching Methods in Multicomputer Networks . . . . . . . 227--238 Abdur Chowdhury and Ophir Frieder and Peng-Jun Wan On the Design, Development, Deployment, and Network Survivability Analysis of the Dynamic Routing System Protocol . . 239--256 Mahmut Kandemir and Alok Choudhary and J. Ramanujam An I/O-Conscious Tiling Strategy for Disk-Resident Data Sets . . . . . . . . 257--284 Shaharuddin Salleh and Bahrom Sanugi and Hishamuddin Jamaluddin and Stephan Olariu and Albert Y. Zomaya Enhanced Simulated Annealing Technique for the Single-Row Routing Problem . . . 285--302
Anonymous Guest Editors' Editorial . . . . . . . . 5--6 Christopher Stone and Suresh Menon Parallel Simulations of Swirling Turbulent Flames . . . . . . . . . . . . 7--28 Sha Meng and Xin Kai Li and Gwynne Evans Numerical Simulation of Oldroyd-B Fluid in a Contraction Channel . . . . . . . . 29--43 Kwai L. Wong and A. J. Baker A Modular Collaborative Parallel CFD Workbench . . . . . . . . . . . . . . . 45--53 James S. Pascoe and Roger J. Loader and Vaidy S. Sunderam Collaborative Group Membership . . . . . 55--68 Alexei S. Rodionov and Hyunseung Choo and Hee Y. Youn Process Simulation Using Randomized Markov Chain and Truncated Marginal Distribution . . . . . . . . . . . . . . 69--85 Marina L. Gavrilova On a Nearest-Neighbor Problem Under Minkowski and Power Metrics for Large Data Sets . . . . . . . . . . . . . . . 87--98 Bernard M. E. Moret and David A. Bader and Tandy Warnow High-Performance Algorithm Engineering for Computational Phylogenetics . . . . 99--111 Chih Jeng Kenneth Tan Solving Systems of Linear Equations with Relaxed Monte Carlo Method . . . . . . . 113--123
Hamid Sarbazi-Azad and Mohamed Ould-Khaoua and Lewis M. Mackenzie A Performance Model of Adaptive Wormhole Routing in $k$-Ary $n$-Cubes in the Presence of Digit-Reversal Traffic . . . 139--159 Vladimir Shurbanov and Dimiter Avresky and Pankaj Mehra and W. Watson Flow Control in ServerNet® Clusters 161--173 Jinho Ahn and Sung-Gi Min and Chong-Sun Hwang and Heonchang Yu Efficient Garbage Collection Schemes for Causal Message Logging with Independent Checkpointing . . . . . . . . . . . . . 175--196 Chih-Yung Chang and Jang-Ping Sheu and Hsi-Chiuen Chen Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping . . . . . . . . . . . . . . . . 197--219
Alan J. Wallcraft A Comparison of Co-Array Fortran and OpenMP Fortran for SPMD Programming . . 231--250 Hyun-Gyoo Yook and Myong-Soon Park Scheduling GEN\_BLOCK Array Redistribution . . . . . . . . . . . . . 251--267 Hazem M. Bahig and Sameh S. Daoud and Mahmoud K. A. Khairat Parallel Self-Index Integer Sorting . . 269--275 Ching-Feng Lin and Don-Lin Yang and Yeh-Ching Chung Parallel Shear-Warp Factorization Volume Rendering Using Efficient $1$-D and $2$-D Partitioning Schemes for Distributed Memory Multicomputers . . . 277--302
Rod Oldehoeft Introduction to Software for High-Performance Systems Papers from the Symposium of The Los Alamos Computer Science Institute . . . . . . . . . . . 5--6 Keith D. Cooper and Devika Subramanian and Linda Torczon Adaptive Optimizing Compilers for the 21st Century . . . . . . . . . . . . . . 7--22 Steven J. Deitz and Bradford L. Chamberlain and Lawrence Snyder High-level Language Support for User-defined Reductions . . . . . . . . 23--37 Julian Cummings and Michael Aivazis and Ravi Samtaney and Raul Radovitzky and Sean Mauch and Dan Meiron A Virtual Test Facility for the Simulation of Dynamic Response in Materials . . . . . . . . . . . . . . . 39--50 Annette C. Feng and Apu C. Kapadia and Wu-Chun Feng and Geneva G. Belford Packet Spacing: An Enabling Mechanism for Delivering Multimedia Content in Computational Grids . . . . . . . . . . 51--66 Wu-Chun Feng and Mark K. Gardner and Jeffrey R. Hay The MAGNeT Toolkit: Design, Implementation and Evaluation . . . . . 67--79 John Mellor-Crummey and Robert J. Fowler and Gabriel Marin and Nathan Tallent HPCVIEW: a Tool for Top-down Analysis of Node Performance . . . . . . . . . . . . 81--104 Bernd Mohr and Allen D. Malony and Sameer Shende and Felix Wolf Design and Prototype of a Performance Tool Interface for OpenMP . . . . . . . 105--128 Anonymous Erratum . . . . . . . . . . . . . . . . 132--132
Paul A. Gray and Vaidy S. Sunderam Collaborative Metacomputing with IceT 139--166 Jos de Kloe and Aad van der Steen and Hakan Öksüzoğlu and Henk Dijkstra A Fully Implicit Parallel Ocean Model Using MUMPS . . . . . . . . . . . . . . 167--183 G. Ivan Maldonado and Naveen Kondapalli Online Higher-Order Error Correction of Nonlinear Diffusion Generalized Perturbation Theory Using Neural Networks . . . . . . . . . . . . . . . . 185--192 Amitava Datta Efficient Graph-Theoretic Algorithms on a Linear Array with a Reconfigurable Pipelined Bus System . . . . . . . . . . 193--211 Anonymous Contributing Authors . . . . . . . . . . 213--214
Peter Langendörfer Editorial Comments . . . . . . . . . . . 223--224 Jianping Pan and Jon W. Mark and Sherman X. Shen TCP Performance and Behaviors with Local Retransmissions . . . . . . . . . . . . 225--244 Peter Langendörfer and Michael Methfessel and Horst Frankenfeldt and Irina Babanskaja and Irina Matthaei and Rolf Kraemer Shielding TCP from Wireless Link Errors: Retransmission Effort and Fragmentation 245--260 A. Lahanas and V. Tsaoussidis Experimental Evaluation of TCP-Probing in Mobile Networks . . . . . . . . . . . 261--279 D. Vardalis and V. Tsaoussidis Efficiency/Fairness Tradeoffs in Networks with Wireless Components and Transient Congestion . . . . . . . . . . 281--296
R. Aversa and N. Mazzocca and U. Villano A Case Study of Application Analytical Modeling in Heterogeneous Computing Environments: Cholesky Factorization in a NOW . . . . . . . . . . . . . . . . . 5--24 Jaeheung Yeo and Heon Y. Yeom and Taesoon Park An Asynchronous Protocol for Release Consistent Distributed Shared Memory Systems . . . . . . . . . . . . . . . . 25--41 P. M. W. Knijnenburg and T. Kisuki and M. F. P. O'Boyle Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation . . . . . . . . . . . . . . 43--67 Jae Heon Yun Parallel Performance of Block ILU Preconditioners for a Block-tridiagonal Matrix . . . . . . . . . . . . . . . . . 69--89 Issam Damaj and Hassan Diab Performance Analysis of Linear Algebraic Functions Using Reconfigurable Computing 91--107
Farhad Arbab and Khalil Drira Introduction . . . . . . . . . . . . . . 119--120 K. De Bosschere and J.-M. Jacquet Suspension Terms as a Means for Meta-coordination in the $ \mu $ Log Coordination Framework . . . . . . . . . 121--129 Hans De Bruin BCOOPL: a Language for Controlling Component Interactions . . . . . . . . . 131--139 Paola Inverardi and Henry Muccini Software Architectures and Coordination Models . . . . . . . . . . . . . . . . . 141--149 Marie-Claude Pellegrini and Michel Riveill Component Management in a Dynamic Architecture . . . . . . . . . . . . . . 151--159 H. Tellioğlu Modeling Coordinated Work: Definition and Application of the Model ``Coordinated Work Environment'' . . . . 161--171 Géraldine Texier and Noël Plouzeau Automatic Management of Sessions in Shared Spaces . . . . . . . . . . . . . 173--181 H.-P. Dommel and J. J. Garcia-Luna-Aceves Efficient Group Coordination in Multicast Trees . . . . . . . . . . . . 183--192 Kamel Barkaoui and Rabah Benamara On Concurrency Control in Multidatabase Systems with an Extended Transaction Model . . . . . . . . . . . . . . . . . 193--202 Omar Rafiq and Leo Cacciari Coordination Algorithm for Distributed Testing . . . . . . . . . . . . . . . . 203--211
Yen-Chun Lin and Hong Shen Foreword . . . . . . . . . . . . . . . . 227--228 Surin Kittitornkun and Yu Hen Hu Processor Array Synthesis from Shift-Variant Deep Nested Do Loops . . . 229--249 Jie Li and Yi Pan and Hong Shen More Efficient Topological Sort Using Reconfigurable Optical Buses . . . . . . 251--258 Wen-Yang Lin Finding Optimal Ordering of Sparse Matrices for Column-Oriented Parallel Cholesky Factorization . . . . . . . . . 259--277 Yen-Chun Lin and Yao-Hsien Hsu and Chun-Keng Liu Constructing $ H4 $, a Fast Depth-Size Optimal Parallel Prefix Circuit . . . . 279--304 Feng-Tse Lin and Jing-Shing Yao Fuzzy Critical Path Method Based on Signed-Distance Ranking and Statistical Confidence-Interval Estimates . . . . . 305--325 John Sum and Hong Shen and G. Young and Jie Wu and Chi-Sing Leung Analysis on Extended Ant Routing Algorithms for Network Routing and Management . . . . . . . . . . . . . . . 327--340
Gene Eu Jan and Yuan-Shin Hwang An Efficient Algorithm for Perfect Load Balancing on Hypercube Multiprocessors 5--15 Gwan-Hwan Hwang and Cheng-Wei Chen and Jenq Kuen Lee and Roy Dz-Ching Ju Segmented Alignment: An Enhanced Model to Align Data Parallel Programs of HPF 17--41 M. Arif Wani and Hamid R. Arabnia Parallel Edge-Region-Based Segmentation Algorithm Targeted at Reconfigurable MultiRing Network . . . . . . . . . . . 43--62 Yu-Kwong Kwok On Exploiting Heterogeneity for Cluster Based Parallel Multithreading Using Task Duplication . . . . . . . . . . . . . . 63--72 Hojung Cha and Rhan Ha and Jane W. S. Liu Experimental Analysis of Timing Validation Methods for Distributed Real-Time Systems . . . . . . . . . . . 73--94
George A. Gravvanis Guest Editor's Editorial Computational Science and Scientific Computing . . . . 107--108 Thomas Huckle Factorized Sparse Approximate Inverses for Preconditioning . . . . . . . . . . 109--117 George A. Gravvanis On the Solution of Boundary Value Problems by Using Fast Generalized Approximate Inverse Banded Matrix Techniques . . . . . . . . . . . . . . . 119--129 Dale Shires and Ram Mohan Optimization and Performance of a Fortran 90 MPI-Based Unstructured Code on Large-Scale Parallel Systems . . . . 131--141 Andreas Gerndt and Thomas Van Reimersdahl and Torsten Kuhlen and Christian Bischof and Ingolf Hörschler and Matthias Meinke and Wolfgang Schröder Large-Scale CFD Data Handling in a VR-Based Otorhinolaryngological CAS-System using a Linux-Cluster . . . . 143--154 Eunice E. Santos Parallel Complexity of Matrix Multiplication . . . . . . . . . . . . . 155--175 Marina L. Gavrilova and Muhammad H. Alsuwaiyel Computing the Euclidean Distance Transform on a Linear Array of Processors . . . . . . . . . . . . . . . 177--185
Minyi Guo and Yi Pan and Zhen Liu Symbolic Communication Set Generation for Irregular Parallel Applications . . 199--214 W. Zhu Cluster Queue Structure for Shared-Memory Multiprocessor Systems . . 215--236 Esin Onbaşıoğlu and Linet Özdamar Optimization of Data Distribution and Processor Allocation Problem Using Simulated Annealing . . . . . . . . . . 237--253 Zhen Jiang and Jie Wu Fault-Tolerant Broadcasting in $2$-D Wormhole-Routed Meshes . . . . . . . . . 255--275
Anonymous Guest Editor Introduction for the Special Section on Commercial Applications for High-Performance Computing . . . . . . . . . . . . . . . 5--6 Terry A. Braun and Todd E. Scheetz and Gregg Webster and Abe Clark and Edwin M. Stone and Val C. Sheffield and Thomas L. Casavant Identifying Candidate Disease Genes with High-Performance Computing . . . . . . . 7--24 Mathew S. Thoennes and Charles C. Weems Exploration of the Performance of a Data Mining Application via Hardware Based Monitoring . . . . . . . . . . . . . . . 25--42 Ruppa K. Thulasiram and Parimala Thulasiraman Performance Evaluation of a Multithreaded Fast Fourier Transform Algorithm for Derivative Pricing . . . . 43--58 Hyun-Wook Jin and Chuck Yoo and Jin-Young Choi Firmware-Level Latency Analysis on a Gigabit Network . . . . . . . . . . . . 59--75 G. Min and M. Ould-Khaoua Communication Delay in Wormhole-Switched Tori Networks under Bursty Workloads . . 77--94
Toomas P. Plaks and Peter M. Athanas Engineering of Configurable Systems: Guest Editors' Foreword . . . . . . . . 107--108 Christian Plessl and Marco Platzner Instance-Specific Accelerators for Minimum Covering . . . . . . . . . . . . 109--129 Maya Gokhale and Jan Frigo and Kevin McCabe and James Theiler and Christophe Wolinski and Dominique Lavenier Experience with a Hybrid Processor: $K$-Means Clustering . . . . . . . . . . 131--148 Marcus Bednara and Jürgen Teich Automatic Synthesis of FPGA Processor Arrays from Loop Algorithms . . . . . . 149--165 V. Baumgarte and G. Ehlers and F. May and A. Nückel and M. Vorbach and M. Weinhardt PACT XPP --- a Self-Reconfigurable Data Processing Architecture . . . . . . . . 167--184 Claudia Kretzschmar and Robert Siegmund and Dietmar Müller Low Power Encoding Techniques for Dynamically Reconfigurable Hardware . . 185--203
Toomas P. Plaks and Peter M. Athanas Engineering of Configurable Systems, II. Guest Editors' Foreword . . . . . . . . 219--220 Stefan Janson and Daniel Merkle and Martin Middendorf and Hossam Elgindy and Hartmut Schmeck On Enforced Convergence of ACO and its Implementation on the Reconfigurable Mesh Architecture Using Size Reduction Tasks . . . . . . . . . . . . . . . . . 221--238 Kiran Puttegowda and David I. Lehn and Jae H. Park and Peter Athanas and Mark Jones Context Switching in a Run-Time Reconfigurable System . . . . . . . . . 239--257 Seonil Choi and Ju-wook Jang and Sumit Mohanty and Viktor K. Prasanna Domain-Specific Modeling for Rapid Energy Estimation of Reconfigurable Architectures . . . . . . . . . . . . . 259--281 Paul Heysters and Gerard Smit and Egbert Molenkamp A Flexible and Energy-Efficient Coarse-Grained Reconfigurable Architecture for Mobile Systems . . . . 283--308 Christian Siemers and Volker Winterstein The Universal Configurable Block/Machine --- An Approach for a Configurable SoC-Architecture . . . . . . . . . . . . 309--331 Anonymous Erratum . . . . . . . . . . . . . . . . 332--332
Baback A. Izadi and Füsun Özgüner An Augmented $k$-ary Tree Multiprocessor with Real-Time Fault-Tolerant Capability 5--17 David Er-El and Dror G. Feitelson Communication Models for a Free-Space Optical Cross-Connect Switch . . . . . . 19--48 S. Peigin and B. Epstein and T. Rubin and S. Seror Parallel Large Scale High Accuracy Navier--Stokes Computations on Distributed Memory Clusters . . . . . . 49--68 Subhash Bhalla and Stuart E. Madnick Asynchronous Backup and Initialization of a Database Server for Replicated Database Systems . . . . . . . . . . . . 69--89
Constantine Katsinis and Bahram Nabet A Scalable Interconnection Network Architecture for Petaflops Computing . . 103--128 Yi Zhang and Steve Haga and Rajeev Barua Execution History Guided Instruction Prefetching . . . . . . . . . . . . . . 129--147 Lap-sun Cheung and Yu-kwok Kwok On Load Balancing Approaches for Distributed Object Computing Systems . . 149--175 Oliver Sinnen and Leonel Sousa On Task Scheduling Accuracy: Evaluation Methodology and Results . . . . . . . . 177--194 J. Vigo-Aguiar and S. Natesan A Parallel Boundary Value Technique for Singularly Perturbed Two-Point Boundary Value Problems . . . . . . . . . . . . . 195--206
Qing Yi and Ken Kennedy and Vikram Adve Transforming Complex Loop Nests for Locality . . . . . . . . . . . . . . . . 219--264 S. Loucif and M. Ould-Khaoua Modeling Latency in Deterministic Wormhole-Routed Hypercubes under Hot-Spot Traffic . . . . . . . . . . . . 265--278 Chung-hsing Hsu and Ulrich Kremer A Quantitative Analysis of Tile Size Selection Algorithms . . . . . . . . . . 279--294 Hong-soog Kim and Young-ha Yoon and Dong-soo Han Parallel Processing of First Order Linear Recurrence on SMP Machines . . . 295--310
Teofilo F. Gonzalez Efficient Resource Utilization in Parallel and Distributed Systems . . . . 5--6 G. E. Suh and L. Rudolph and S. Devadas Dynamic Partitioning of Shared Cache Memory . . . . . . . . . . . . . . . . . 7--26 Singling Lee and Hann-jang Ho and Wen-wei Mai An Efficient Scheduling Algorithm for Information Delivery on VoD System . . . 27--41 Eliezer Dekel and Gera Goft ITRA: Inter-Tier Relationship Architecture for End-to-end QoS . . . . 43--70 Yamin Li and Shietung Peng and Wanming Chu Efficient Collective Communications in Dual-Cube . . . . . . . . . . . . . . . 71--90 Bryan Chan and Tarek S. Abdelrahman Run-Time Support for the Automatic Parallelization of Java Programs . . . . 91--117 Anonymous Contributing authors . . . . . . . . . . 119--123 Anonymous Information for authors . . . . . . . . 125--127
Marina L. Gavrilova and C. J. Kenneth Tan Guest Editors' Editorial: Special Issue: Iterative Solving Environments and Optimization Techniques for Scientific Applications . . . . . . . . . . . . . . 131--133 Margreet Nool and Michael M. J. Proot Parallel Implementation of a Least-Squares Spectral Element Solver for Incompressible Flow Problems . . . . 135--148 Jing-Ru C. Cheng and Paul E. Plassmann A Parallel Particle Tracking Framework for Applications in Scientific Computing 149--164 Marina L. Gavrilova Empirical Studies of Optimization Techniques in the Event-Driven Simulation of Mechanically Alloyed Materials . . . . . . . . . . . . . . . 165--176 V. Blanco and P. González and J. C. Cabaleiro and D. B. Heras and T. F. Pena and J. J. Pombo and F. F. Rivera Performance Prediction for Parallel Iterative Solvers . . . . . . . . . . . 177--191 Mohammad A. Rajabi and J. A. Rod Blais Optimization of DTM Interpolation Using SFS with Single Satellite Imagery . . . 193--213 T. K. Ralphs and L. Ladányi and M. J. Saltzman A Library Hierarchy for Implementing Scalable Parallel Search Algorithms . . 215--234 Matthew F. Dixon and C. J. Kenneth Tan Using Distributed Computers to Deterministically Approximate Higher Dimensional Convection Diffusion Equations . . . . . . . . . . . . . . . 235--253
Laurence Tianruo Yang Special issue on High Performance Scientific and Engineering Applications 263--264 Xiaolin Li and Manish Parashar Hierarchical Partitioning Techniques for Structured Adaptive Mesh Refinement Applications . . . . . . . . . . . . . . 265--278 A. Bourgeade and B. Nkonga Dynamic Load Balancing Computation of Pulses Propagating in a Nonlinear Medium 279--294 Yi-chang Zhuang and Tyng Yue Liang and Ce-kuen Shieh and Jun-qi Lee and Laurence Tianruo Yang A Group-Based Load Balance Scheme for Software Distributed Shared Memory Systems . . . . . . . . . . . . . . . . 295--309 María J. Martín and Marta Parada and Ramón Doallo High Performance Air Pollution Simulation Using OpenMP . . . . . . . . 311--321 V. Chaudhary and W. L. Hase and H. Jiang and L. Sun and D. Thaker Experiments with Parallelizing Tribology Simulations . . . . . . . . . . . . . . 323--343 Subhash Bhalla Parallel Concurrency Control Activity for Transaction Management in Real-time Database Systems . . . . . . . . . . . . 345--369 Anonymous Table of Contents: Volume 28 (2004) . . 371--372
Virginia Niculescu On Data Distributions in the Construction of Parallel Programs . . . 5--25 Marc Grunberg and Stéphane Genaud and Catherine Mongenet Seismic Ray-Tracing and Earth Mesh Modeling on Various Parallel Architectures . . . . . . . . . . . . . 27--44 Bader F. Albdaiwi and Marilynn L. Livingston Perfect Distance-$d$ Placements in $2$D Toroidal Networks . . . . . . . . . . . 45--57 Jan-jan Wu and Shih-hsien Yeh and Pangfeng Liu Efficient Multiple Multicast on Heterogeneous Network of Workstations 59--88 Selim G. Akl Superlinear Performance in Real-Time Parallel Computation . . . . . . . . . . 89--111
Minyi Guo Editorial: Parallel and Distributed Processing with Applications . . . . . . 123--123 Ching-Hsien Hsu and Kun-Ming Yu A Compressed Diagonals Remapping Technique for Dynamic Data Redistribution on Banded Sparse Matrix 125--143 Laurence Tianruo Yang and Richard P. Brent Parallel MCGLS and ICGLS Methods for Least Squares Problems on Distributed Memory Architectures . . . . . . . . . . 145--156 Hui Wang and Minyi Guo and Daming Wei A Divide-and-Conquer Algorithm for Irregular Redistribution in Parallelizing Compilers . . . . . . . . 157--170 Xuhui Li and Jiannong Cao and Yanxiang He A Direct Execution Approach to Simulating Mobile Agent Algorithms . . . 171--184 Ling Chen and Hongjian Chen and Yi Pan and Yixin Chen A Fast Efficient Parallel Hough Transform Algorithm on LARPBS . . . . . 185--195 Weijia Jia and Bo Han and Chuanlin Zhang and Wanlei Zhou Delay Control and Parallel Admission Algorithms for Real-Time Anycast Flow 197--209 Chain-Wu Lee and Chun-Hsi Huang and Laurence Tianruo Yang and Sanguthevar Rajasekaran Distributed Path-Based Inference in Semantic Networks . . . . . . . . . . . 211--227
S. Peigin and B. Epstein Embedded Parallelization Approach for Optimization in Aerodynamic Design . . . 243--263 Neng-Chung Wang and Chih-Ping Chu and Tzung-Shi Chen Multipath-Based Multicasting Strategies for Wormhole-Routed Star Graph Interconnection Networks . . . . . . . . 265--286 Ramin Nasiri and Ali Moeini and Ahmad Abdollahzadeh A New Approach Towards Procurement of Software Models Via Distributed Business Models . . . . . . . . . . . . . . . . . 287--302 Amitava Datta Multiple Addition and Prefix Sum on a Linear Array with a Reconfigurable Pipelined Bus System . . . . . . . . . . 303--317
Mo Zeyao and Fu Lianxiang Parallel Flux Sweep Algorithm for Neutron Transport on Unstructured Grid 5--17 Po-Jen Chuang and Young-Tzong Hsiao and Yu-Shian Chiu An Efficient Value Predictor Dynamically Using Loop and Locality Properties . . . 19--36 Chao Lin Heuristic Contention-Free Broadcast in Heterogeneous Networks of Workstations with Multiple Send and Receive Speeds 37--64 Anonymous Contributing Authors . . . . . . . . . . 65--65
George A. Gravvanis and Eunice E. Santos Guest Editorial: Special Issue: Special Issue on Parallel and Distributed Processing and Applications . . . . . . 75--76 G. A. Gravvanis and K. M. Giannoutakis and M. P. Bekakos and O. B. Efremides Parallel and Systolic Solution of Normalized Explicit Approximate Inverse Preconditioning . . . . . . . . . . . . 77--96 Eunice E. Santos Optimal and Efficient Parallel Tridiagonal Solvers Using Direct Methods 97--115 Shuting Xu and Jun Zhang A Parallel Hybrid Web Document Clustering Algorithm and its Performance Study . . . . . . . . . . . . . . . . . 117--131 James M. Baker, Jr. and Brian Gold and Mark Bucciero and Sidney Bennett and Rajneesh Mahajan and Priyadarshini Ramachandran and Jignesh Shah SCMP: a Single-Chip Message-Passing Parallel Computer . . . . . . . . . . . 133--149 Tatiana Tabirca and Len Freeman and Sabin Tabirca and Laurence Tianruo Yang Feedback Guided Dynamic Loop Scheduling: Convergence of the Continuous Case . . . 151--178 Ilias G. Maglogiannis and Elias P. Zafiropoulos and Agapios N. Platis and George A. Gravvanis Computing the Success Factors in Consistent Acquisition and Recognition of Objects in Color Digital Images by Explicit Preconditioning . . . . . . . . 179--198 Anonymous Contributing Authors . . . . . . . . . . 199--200
Toomas P. Plaks Foreword: Engineering of Configurable Systems, III . . . . . . . . . . . . . . 211--212 Soheil Ghiasi and Hyun J. Moon and Ani Nahapetian and Majid Sarrafzadeh Collaborative and Reconfigurable Object Tracking . . . . . . . . . . . . . . . . 213--238 William D. Smith and Austars R. Schnore Towards an RCC-Based Accelerator for Computational Fluid Dynamics Applications . . . . . . . . . . . . . . 239--261 Gerard K. Rauwerda and Paul M. Heysters and Gerard J. M. Smit Mapping Wireless Communication Algorithms onto a Reconfigurable Architecture . . . . . . . . . . . . . . 263--282 Linda Kaouane and Mohamed Akil and Thierry Grandpierre and Yves Sorel A Methodology to Implement Real-Time Applications onto Reconfigurable Circuits . . . . . . . . . . . . . . . . 283--301 Anonymous Contributing Authors . . . . . . . . . . 303--305
Yuan Zhao and Ken Kennedy Scalarization Using Loop Alignment and Loop Skewing . . . . . . . . . . . . . . 5--46 Steve Haga and Natasha Reeves and Rajeev Barua and Diana Marculescu Dynamic Functional Unit Assignment for Low Power . . . . . . . . . . . . . . . 47--62 Jie Wu and Li Sheng Deadlock-Free Multicasting in Irregular Networks Using Prefix Routing . . . . . 63--78 Farhad Kamangar and David Levine and Gergely V. Záruba and Renjith Thomas Mobile Agent Connection Establishment and Management (CEMA) --- Message Exchange for Pervasive Computing Environments . . . . . . . . . . . . . . 79--99 Anonymous Contributing Authors . . . . . . . . . . 101--103
Weng-Long Chang and Chih-Ping Chu and Jia-Hwa Wu A Polynomial-Time Dependence Test for Determining Integer-Valued Solutions in Multi-Dimensional Arrays Under Variable Bounds . . . . . . . . . . . . . . . . . 111--135 Wei-Keng Liao and Alok Choudhary and Donald Weiner and Pramod Varshney Performance Evaluation of a Parallel Pipeline Computational Model for Space-Time Adaptive Processing . . . . . 137--160 Ramzi Basharahil and Brian Wims and Cheng-Zhong Xu and Song Fu Distributed Shared Arrays: An Integration of Message Passing and Multithreading on SMP Clusters . . . . . 161--184 Hamid Reza Zarandi and Hamid Sarbazi-Azad Hierarchical Binary Set Partitioning in Cache Memories . . . . . . . . . . . . . 185--202
Djibo Karimou and Jean Frédéric Myoupo An Application of an Initialization Protocol to Permutation Routing in a Single-Hop Mobile Ad Hoc Networks . . . 215--226 N. Sklavos and O. Koufopavlou Implementation of the SHA-2 Hash Family Standard Using FPGAs . . . . . . . . . . 227--248 Ali Kemal Sinop and Tolga Abaci and Ümit Akkus and Attila Gürsoy and Ugur Güdükbay PHR: a Parallel Hierarchical Radiosity System with Dynamic Load Balancing . . . 249--263 Shaharuddin Salleh and Stephan Olariu and Bahrom Sanugi and Mohd Ismail Abd Aziz Single-Row Transformation of Complete Graphs . . . . . . . . . . . . . . . . . 265--279 Anonymous Contributing Authors . . . . . . . . . . 281--282
Thiab R. Taha and Xiangming Xu Parallel Split-Step Fourier Methods for the Coupled Nonlinear Schrödinger Type Equations . . . . . . . . . . . . . . . 5--23 Wonjun Lee and Jaideep Srivastava and Bikash Sabata QoS-Aware Admission Control and Dynamic Resource Provisioning Framework in Ubiquitous Multimedia Computing Environments . . . . . . . . . . . . . . 25--50 Panagiotis D. Michailidis and Konstantinos G. Margaritis New Processor Array Architectures for the Longest Common Subsequence Problem 51--69 S. Parsa and O. Bushehrian The Design and Implementation of a Framework for Automatic Modularization of Software Systems . . . . . . . . . . 71--94 Anonymous Contributing Authors . . . . . . . . . . 95--95
Toomas P. Plaks Engineering of Configurable Systems, IV: Foreword . . . . . . . . . . . . . . . . 103--104 Peter Bellows High-Visibility Debug-By-Design for FPGA Platforms . . . . . . . . . . . . . . . 105--118 Tim Todman and José Gabriel de F. Coutinho and Wayne Luk Customisable Hardware Compilation . . . 119--137 Viktor K. Prasanna Energy-Efficient Computations on FPGAs 139--162 Stephan Gatzka and Christian Hochberger The AMIDAR Class of Reconfigurable Processors . . . . . . . . . . . . . . . 163--181 Anonymous Contributing Authors . . . . . . . . . . 183--184
Ming Zhu and Wentong Cai and Bu-Sung Lee and Xudong Wu A Hybrid Analysis of an Optimization Approach for Cluster Applications . . . 191--215 Masahiro Sowa and Ben A. Abderazek and Tsutomu Yoshinaga Parallel Queue Processor Architecture Based on Produced Order Computation Model . . . . . . . . . . . . . . . . . 217--229 Abdel-Elah Al-Ayyoub Distributed Unidirectional and Bidirectional Heuristic Search: Algorithm Design and Empirical Assessment . . . . . . . . . . . . . . . 231--250 P. Alonso and J. M. Badía and A. M. Vidal An Efficient Parallel Algorithm to Solve Block-Toeplitz Systems . . . . . . . . . 251--278 Anonymous Contributing Authors . . . . . . . . . . 279--280
Youngsong Mun Guest Editorial . . . . . . . . . . . . 5--6 Young-Cheol Bang and Hyunseung Choo On Bandwidth Adjusted Multicast Communications in Pipeline Router Architecture . . . . . . . . . . . . . . 7--18 Hoonmo Yang and Moonkey Lee Embedded Processor Validation Environment Using a Cycle-Accurate Retargetable Instruction-Set Simulator 19--32 Youngsong Mun Performance Analysis of Banyan-Type Multistage Interconnection Networks Under Nonuniform Traffic Pattern . . . . 33--52 Minyeol Lim and Eui-Nam Huh An Efficient Design and Implementation for Grid Advanced Information Service 53--63 Gyung-Leen Park and Hee Youn Yong A New Approach for High Performance Computing Systems with Various Checkpointing Schemes . . . . . . . . . 65--78 Seongjin Ahn and Jin Chung Wook A Study upon SNMP MIB II Algorithm: For the Calculation of the Line Utilization 79--91 Byoung Min Joon Intrusion-Tolerant Server Architecture for Survivable Services . . . . . . . . 93--102 D. S. Kim and Y. J. Jung and T. M. Chung PRISM: a Preventive and Risk-Reducing Integrated Security Management Model Using Security Label . . . . . . . . . . 103--121 Byung-Gi Kim and Sang-Sun Hong Secure Mutual Authentication for Ad hoc Wireless Networks . . . . . . . . . . . 123--132 Joong-Jae Lee and Won Kim and Na-Young Lee and Gye-Young Kim A New Incremental Watermarking Based on Dual-Tree Complex Wavelet Transform . . 133--140 Anonymous Contributing Authors . . . . . . . . . . 141--147
Leonidas Deligiannidis and Robert J. K. Jacob Improving Performance of Virtual Reality Applications Through Parallel Processing 155--173 Ching-Hsien Hsu Sparse Matrix Block-Cyclic Realignment on Distributed Memory Machines . . . . . 175--196 Maria Athanasaki and Aristidis Sotiropoulos and Georgios Tsoukalas and Nectarios Koziris and Panayiotis Tsanakas Hyperplane Grouping and Pipelined Schedules: How to Execute Tiled Loops Fast on Clusters of SMPs . . . . . . . . 197--226 Anonymous Contributing Authors . . . . . . . . . . 227--227
Nen-Chung Wang and Chih-Ping Chu An Efficient Tree-Based Multicasting Algorithm on Wormhole-Routed Star Graph Interconnection Networks Embedded with Hamiltonian Path . . . . . . . . . . . . 5--26 Rose Joshua and David H. Scuse Modeling Beliefs and Solution Strategies in a Distributed Learning System . . . . 27--39 E. M. Garz and I. García Approaches Based on Permutations for Partitioning Sparse Matrices on Multiprocessors . . . . . . . . . . . . 41--61 Ching-Wen Chen and Chung-Ping Chung Designing a Disjoint Paths Interconnection Network with Fault Tolerance and Collision Solving . . . . 63--80 Anonymous Contributing Authors . . . . . . . . . . 81--81
Mohamed Ould-Khaoua and Geyong Min Performance Evaluation of Grid and Cluster Computing Systems . . . . . . . 91--92 David A. Bacigalupo and Stephen A. Jarvis and Ligang He and Daniel P. Spooner and Donna N. Dillenberger and Graham R. Nudd An Investigation into the Application of Different Performance Prediction Methods to Distributed Enterprise Applications 93--111 P. Cicotti and M. Taufer and Andrew A. Chien DGMonitor: a Performance Monitoring Tool for Sandbox-Based Desktop Grid Platforms 113--133 William M. Jones and Walter B. Ligon and Louis W. Pang and Dan Stanzione Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters . . . . . . . . 135--163 Hai Jin and Xia Xie and Yunfa Li and Zongfen Han and Zhihua Dai and Peng Lu A Real-Time Performance Evaluation Model for Distributed Software with Reliability Constrains . . . . . . . . . 165--179 Mark M. Mathis and Darren J. Kerbyson A General Performance Model of Structured and Unstructured Mesh Particle Transport Computations . . . . 181--199 D. A. Grove and P. D. Coddington Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers . . . . . . . . . . . 201--217 Anonymous Contributing Authors . . . . . . . . . . 219--223
Robert L. Grossman and M. Mazzucco and H. Sivakumar and Y. Pan and Q. Zhang Simple Available Bandwidth Utilization Library for High-Speed Wide Area Networks . . . . . . . . . . . . . . . . 231--242 Paulius Micikevicius and Narsingh Deo Cluster Computing for Determining Three-Dimensional Protein Structure . . 243--271 Wei Jie and Terence Hung and Wentong Cai An Information Service for Grid Virtual Organization: Architecture, Implementation and Evaluation . . . . . 273--290 Chun-Yuan Lin and Yeh-Ching Chung and Jen-Shiuh Liu Efficient Data Distribution Schemes for EKMR-Based Sparse Arrays on Distributed Memory Multicomputers . . . . . . . . . 291--313 Chao-Tung Yang and Kuan-Wei Cheng and Kuan-Ching Li An Enhanced Parallel Loop Self-Scheduling Scheme for Cluster Environments . . . . . . . . . . . . . . 315--335
Stefano Avallone and Donato Emma and Antonio Pescapè and Giorgio Ventre High Performance Internet Traffic Generators . . . . . . . . . . . . . . . 5--26 Alan Bradley and Kevin Curran and Gerard Parr Discovering Resources in Computational GRID Environments . . . . . . . . . . . 27--49 Shahriar Pourazin and Ahmad AbdollahZadeh Barforoush Concurrent Metareasoning . . . . . . . . 51--64 Nikolaos Drosinos and Nectarios Koziris The Effect of Process Topology and Load Balancing on Parallel Programming Models for SMP Clusters and Iterative Algorithms . . . . . . . . . . . . . . . 65--91 Ling Wang and Yingtao Jiang and Henry Selvaraj Scheduling and Partitioning Schemes for Low Power Designs Using Multiple Supply Voltages . . . . . . . . . . . . . . . . 93--113
Hon-Hing Wan and Yu-Kwong Kwok High Data Rate Video Transmission Using Parallel TCP Connections: Approaches and Performance Evaluation . . . . . . . . . 119--139 Soyeon Park and Seung Ryoul Maeng Log-Based Rollback Recovery without Checkpoints of Shared Memory in Software DSM . . . . . . . . . . . . . . . . . . 141--154 Selim G. Akl and Weiguang Yao A Parallel Approach Eliminates Measurement Perturbations in RLC Circuits . . . . . . . . . . . . . . . . 155--164 Keqiu Li and Hong Shen and Keishi Tajima and Liusheng Huang An Effective Cache Replacement Algorithm in Transcoding-Enabled Proxies . . . . . 165--184 Michalis D. Galanis and Gregory Dimitroulakos and Costas E. Goutis Performance Improvements from Partitioning Applications to FPGA Hardware in Embedded SoCs . . . . . . . 185--199
Mohamed Ould-Khaoua and Geyong Min Performance Modelling and Evaluation of Parallel and Distributed Systems . . . . 203--204 Keqin Li Optimal Period of Workload Redistribution for Dynamic Bulk Synchronous Computations in Heterogeneous Computing Systems . . . . 205--226 Jacques M. Bahi and Sylvain Contassot-Vivier and Raphaël Couturier Performance Comparison of Parallel Programming Environments for Implementing AIAC Algorithms . . . . . . 227--244 Geyong Min and Mohamed Ould-Khaoua and Demetres D. Kouvatsos and Irfan U. Awan Stochastic Analysis of Deterministic Routing Algorithms in the Presence of Self-Similar Traffic . . . . . . . . . . 245--258 Azzedine Boukerche and Caron Dzermajko Scalability and Performance Evaluation of DDM-Based Aggregation/Dissaggregation Protocols for Large-Scale Distributed Interactive Simulations Systems . . . . 259--276 Hao Tian and Ajay K. Katangur and Jiling Zhong and Yi Pan A Novel Multistage Network Architecture with Multicast and Broadcast Capability 277--300 Zhiyong Xu and Laxmi Bhuyan and Yiming Hu Tulip: a New Hash Based Cooperative Web Caching Architecture . . . . . . . . . . 301--320
Yawen Chen and Hong Shen and Fangai Liu Wavelength Assignment for Realizing Parallel FFT on Regular Optical Networks 3--16 Jia Yu and Srikumar Venugopal and Rajkumar Buyya A Market-Oriented Grid Directory Service for Publication and Discovery of Grid Service Providers and their Services . . 17--31 N. Sklavos and P. Kitsos and K. Papadopoulos and O. Koufopavlou Design, Architecture and Performance Evaluation of the Wireless Transport Layer Security . . . . . . . . . . . . . 33--50 Claudio Naoto Fuzitaki and Paulo Blauth Menezes and Júlio Pereira Machado and Fernando D'Andrea Nautilus, a Concurrent Diagrammatic Specification and Programming Language 51--81 Saeed Parsa and Shahriar Lotfi A New Approach to Parallelization of Serial Nested Loops Using Genetic Algorithms . . . . . . . . . . . . . . . 83--94
Rod Oldehoeft Computer science in support of high-performance applications: Papers from the 2004 LACSI Symposium . . . . . 99--100 Cristian Coarfa and Yuri Dotsenko and John Mellor-Crummey Experiences with Sweep3D implementations in Co-array Fortran . . . . . . . . . . 101--121 Christopher D. Rickett and Sung-Eun Choi and Craig E. Rasmussen and Matthew J. Sottile Rapid prototyping frameworks for developing scientific applications: a case study . . . . . . . . . . . . . . . 123--134 Keith D. Cooper and Alexander Grosul and Timothy J. Harvey and Steve Reeves and Devika Subramanian and Linda Torczon and Todd Waterman Exploring the structure of the space of compilation sequences using randomized search algorithms . . . . . . . . . . . 135--151 Sriram Krishnamoorthy and Gerald Baumgartner and Chi-Chung Lam and Jarek Nieplocha and P. Sadayappan Layout transformation support for the disk resident arrays framework . . . . . 153--170 Erik A. Hendriks and Ronald G. Minnich How to build a fast and reliable 1024 node cluster with only one disk . . . . 171--181 Apan Qasem and Ken Kennedy and John Mellor-Crummey Automatic tuning of whole applications using direct search and a performance-based transformation system 183--196
Minyi Guo and Jiannong Cao and Laurence T. Yang Foreword . . . . . . . . . . . . . . . . 199--200 Yih-Fang Lin and Chien-Min Wang and Jan-Jan Wu Optimizing I/O server placement for parallel I/O on switch-based irregular networks . . . . . . . . . . . . . . . . 201--217 Hiroyuki Takizawa and Hiroaki Kobayashi Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing . . . . . 219--234 Bin Xiao and Wei Chen and Yanxiang He A novel approach to detecting DDoS Attacks at an Early Stage . . . . . . . 235--248 Tianqi Wang and Cho-Li Wang and Francis C. M. Lau An architecture to support scalable distributed virtual environment systems on Grid . . . . . . . . . . . . . . . . 249--264 Kin Wah Kwong and Danny H. K. Tsang A congestion-aware search protocol for heterogeneous peer-to-peer networks . . 265--282 Richard S. L. Wu and Allan K. Y. Wong and Tharam S. Dillon CACHE$_{RP}$: a novel dynamic cache size tuning model working with relative object popularity for fast Web information retrieval . . . . . . . . . 283--296 Son Hong Ngo and Xiaohong Jiang and Vinh Trong Le and Susumu Horiguchi Ant-based survivable routing in dynamic WDM networks with shared backup paths 297--307 Percival Xavier and Wentong Cai and Bu-Sung Lee Workload management of cooperatively federated computing clusters . . . . . . 309--322 Wentong Cai and Jiannong Cao and Wei Chen and T. S. Dillon and Minyi Guo and Susumu Horiguchi and Xiaohong Jiang and Hiroaki Kobayashi and Kin Wah Kwong and Francis C. M. Lau and Vinh Trong Le and Bu-Sung Lee and Son Hong Ngo and Hiroyuki Takizawa and Danny Tsang and Cho-Li Wang and Tianqi Wang and Allan K. Y. Wong and Richard S. L. Wu and Percival Xavier and Bin Xiao and Laurence T. Yang and Yanxiang He Contributing Authors . . . . . . . . . . 323--330
Sanjay P. Ahuja and Jack R. Myers A Survey on Wireless Grid Computing . . 3--21 M. R. HoseinyFarahabady and H. Sarbazi-Azad The Grid-Pyramid: a Generalized Pyramid Network . . . . . . . . . . . . . . . . 23--45 Kyungoh Ohn and Haengrae Cho Dynamic Affinity Cluster Allocation in a Shared Disks Cluster . . . . . . . . . . 47--69 Congnan Luo and Anil L. Pereira and Soon M. Chung Distributed Mining of Maximal Frequent Itemsets on a Data Grid System . . . . . 71--90 Hsiao-Hsi Wang and Kuan-Ching Li and Kuo-Jen Wang and Ssu-Hsuan Lu On the Design and Implementation of an Effective Prefetch Strategy for DSM Systems . . . . . . . . . . . . . . . . 91--112
Koushik Sinha and Pradip K. Srimani Deterministic Broadcast and Gossiping Algorithms for Ad hoc Networks . . . . . 115--144 Jyh-Biau Chang and Ce-Kuen Shieh and Tyng-Yeu Liang A Transparent Distributed Shared Memory for Clustered Symmetric Multiprocessors 145--160 Jong-Hoon Youn and Bella Bose and Seungjin Park Fault-Tolerant Routing Algorithm in Meshes with Solid Faults . . . . . . . . 161--177 Athanasios P. Kakarountas and Haralambos Michail and Athanasios Milidonis and Costas E. Goutis and George Theodoridis High-Speed FPGA Implementation of Secure Hash Algorithm for IPSec and VPN Applications . . . . . . . . . . . . . . 179--195 Mehdi Kargahi and Ali Movaghar A Method for Performance Analysis of Earliest-Deadline-First Scheduling Policy . . . . . . . . . . . . . . . . . 197--222 Jyh-Biau Chang and Tyng-Yeu Liang and Ce-Kuen Shieh Contributing Authors . . . . . . . . . . 223--223
Hassan Reza A Methodology for Architectural Design of Concurrent and Distributed Software Systems . . . . . . . . . . . . . . . . 227--248 Saeed Parsa and Shahriar Lotfi A New Genetic Algorithm for Loop Tiling 249--269 Slavko Gajin and Zoran Jovanović Explanation of Performance Degradation in Turn Model . . . . . . . . . . . . . 271--295 Jih-Woei Huang and Chih-Ping Chu An Efficient Communication Scheduling Method for the Processor Mapping Technique Applied Data Redistribution 297--318 M. Ch. Karra and M. P. Bekakos A FPGA-Based Systolic Array Prototype Implementing the Quadrant Interlocking Factorization Method . . . . . . . . . . 319--331
Ben A. Abderazek and Tsutomu Yoshinaga and Masahiro Sowa High-Level Modeling and FPGA Prototyping of Produced Order Parallel Queue Processor Core . . . . . . . . . . . . . 3--15 Michalis D. Galanis and Gregory Dimitroulakos and Costas E. Goutis Partitioning Methodology for Heterogeneous Reconfigurable Functional Units . . . . . . . . . . . . . . . . . 17--34 Junjun Deng and Hengyong Yu and Jun Ni and Tao He and Shiying Zhao and Lihe Wang and Ge Wang A Parallel Implementation of the Katsevich Algorithm for $3$-D CT Image Reconstruction . . . . . . . . . . . . . 35--47 Salvador Petit and Julio Sahuquillo and Ana Pont and David Kaeli Addressing a workload characterization study to the design of consistency protocols . . . . . . . . . . . . . . . 49--72 Ahmet Ugur and Henry Thompson The $p$-sized partitioning algorithm for fast computation of factorials of numbers . . . . . . . . . . . . . . . . 73--82 Jean-Frédéric Myoupo and David Semé Work-efficient BSR-based parallel algorithms for some fundamental problems in graph theory . . . . . . . . . . . . 83--107
A. Al-Ayyoub and A. Awwad and K. Day and M. Ould-Khaoua Generalized methods for algorithm development on optical systems . . . . . 111--125 Eui-Nam Huh and Lonnie R. Welch Adaptive resource management for dynamic distributed real-time applications . . . 127--142 Natali Hritonenko and Yuri Yatsenko Creative destruction of computing systems: analysis and modeling . . . . . 143--154 Junfeng Wang and Hongxia Zhou and Mingtian Zhou and Lei Li A general model for long-tailed network traffic approximation . . . . . . . . . 155--172 J. A. R. Blais and D. A. Provins and M. A. Soofi Spherical harmonic transforms for discrete multiresolution applications 173--187 Li Xiao and Xiaodong Zhang and Zhengqian Kuang and Baiming Feng and Jichang Kang Auto-CFD-NOW: a pre-compiler for effectively parallelizing CFD applications on networks of workstations 189--217
Wu Jigang and Thambipillai Srikanthan Algorithmic aspects of area-efficient hardware/software partitioning . . . . . 223--235 Won W. Ro and Stephen P. Crago and Alvin M. Despain and Jean-Luc Gaudiot Design and evaluation of a hierarchical decoupled architecture . . . . . . . . . 237--259 Nader Mohamed and Jameela Al-Jaroodi and Hong Jiang and David Swanson High-performance message striping over reliable transport protocols . . . . . . 261--278 Bruno Volckaert and Pieter Thysebaert and Marc De Leenheer and Filip De Turck and Bart Dhoedt and Piet Demeester Flexible Grid service management through resource partitioning . . . . . . . . . 279--305 Ching-Wen Chen Design schemes of dynamic rerouting networks with destination tag routing for tolerating faults and preventing collisions . . . . . . . . . . . . . . . 307--326
Ismail Omar Hababeh and Muthu Ramachandran and Nicholas Bowring A high-performance computing method for data allocation in distributed database systems . . . . . . . . . . . . . . . . 3--18 Yanjun Li and Soon M. Chung Parallel bisecting $k$-means with prediction clustering algorithm . . . . 19--37 Sun-Yuan Hsieh An efficient parallel strategy for the perfect domination problem on distance-hereditary graphs . . . . . . . 39--57 Nirved Pandey and G. K. Sharma Startup comparison for message passing libraries with DTM on Linux clusters . . 59--72 Shaharuddin Salleh and Stephan Olariu and Albert Y. Zomaya and Kiew Leh Yieng and Nur Arina B. Aziz Single-row mapping and transformation of connected graphs . . . . . . . . . . . . 73--89
Matteo Frigo and Volker Strumpen The memory behavior of cache oblivious stencil computations . . . . . . . . . . 93--112 Charlie Obimbo and Behzad Salami A parallel algorithm for determining the inverse of a matrix for use in blockcipher encryption/decryption . . . 113--130 Kelvin Yiu-Lun Tsoi and Yu-Kwong Kwok Practical algorithms for scheduling video data in a local area network environment . . . . . . . . . . . . . . 131--147 Yingwu Zhu and Yiming Hu Exploiting client caches to build large Web caches . . . . . . . . . . . . . . . 149--175 Sumir Chandra and Xiaolin Li and Taher Saif and Manish Parashar Enabling scalable parallel implementations of structured adaptive mesh refinement applications . . . . . . 177--203 Jae-Wan Jang and Jin-Soo Kim Design issues and performance comparisons in supporting the sockets interface over user-level communication architecture . . . . . . . . . . . . . . 205--226 Guangming Tan and Shengzhong Feng and Ninghui Sun Cache oblivious algorithms for nonserial polyadic programming . . . . . . . . . . 227--249
Michalis D. Galanis and Gregory Dimitroulakos and Costas E. Goutis Exploring the speedups of embedded microprocessor systems utilizing a high-performance coprocessor data-path 251--271 John D. Holt and Soon M. Chung Parallel mining of association rules from text databases . . . . . . . . . . 273--299 P. W. C. Prasad and Ali Assi and Azam Beg Binary Decision Diagrams and neural networks . . . . . . . . . . . . . . . . 301--320 Chin-Feng Lin and Yeh-Ching Chung and Don-Lin Yang TRLE --- an efficient data compression scheme for image composition of volume rendering on distributed memory multicomputers . . . . . . . . . . . . . 321--345 Hassan Reza and Emanuel S. Grant A method to test concurrent systems using architectural specification . . . 347--357
Jung-Lok Yu and Jin-Soo Kim and Seung-Ryoul Maeng A runtime resolution scheme for priority boost conflict in implicit coscheduling 1--28 Bahman Javadi and Jemal H. Abawajy and Mohammad K. Akbari Analytical modeling of interconnection networks in heterogeneous multi-cluster systems . . . . . . . . . . . . . . . . 29--47 E. I. Milovanović and I. Z. Milovanović and M. P. Bekakos and I. N. Tselepis Computing all-pairs shortest paths on a linear systolic array and hardware realization on a reprogrammable FPGA platform . . . . . . . . . . . . . . . . 49--66 Juan A. Acebrón and Renato Spigler Supercomputing applications to the numerical modeling of industrial and applied mathematics problems . . . . . . 67--80 Antonio Plaza and Javier Plaza and David Valencia Impact of platform heterogeneity on the design of parallel algorithms for morphological processing of high-dimensional image data . . . . . . 81--107
Guowen Han and Yuanyuan Yang Scheduling and performance analysis of multicast interconnects . . . . . . . . 109--125 Grigoris Dimitroulakos and Michalis D. Galanis and Costas E. Goutis Design space exploration of an optimized compiler approach for a generic reconfigurable array architecture . . . 127--157 Gyu Sang Choi and Jin-Ha Kim and Deniz Ersoz and Andy B. Yoo and Chita R. Das A comprehensive performance and energy consumption analysis of scheduling alternatives in clusters . . . . . . . . 159--184 Xianyang Jiang and Peiheng Zhang and Xinchun Liu and Stephen S.-T. Yau Survey on index based homology search algorithms . . . . . . . . . . . . . . . 185--212
Chia-Wei Wang and Hung-Chang Hsiao and Wen-Hung Sun and Chung-Ta King and Ming-Tsung Sun Building a tuple space on structured peer-to-peer networks . . . . . . . . . 213--227 Ching-Hsien Hsu and Shih-Chang Chen and Chao-Yang Lan Scheduling contention-free irregular redistributions in parallelizing compilers . . . . . . . . . . . . . . . 229--247 Chao-Tung Yang and Po-Chi Shih and Cheng-Fang Lin and Sung-Yi Chen A resource broker with an efficient network information model on Grid environments . . . . . . . . . . . . . . 249--267 Chao-Tung Yang and I-Hsien Yang and Kuan-Ching Li and Shih-Yu Wang Improvements on dynamic adjustment mechanism in co-allocation data Grid environments . . . . . . . . . . . . . . 269--280 Chung-Kai Chen and Yu-Hao Chang and Yu-Tin Chen and Chih-Chieh Yang and Jenq-Kuen Lee Switching supports for stateful object remoting on network processors . . . . . 281--298 Kuan-Ching Li and Hsun-Chang Chang The design and implementation of visual performance monitoring and analysis toolkit for cluster and Grid environments . . . . . . . . . . . . . . 299--317 Shao-Yang Wang and Rong-Guey Chang Code size reduction by compressing repeated instruction sequences . . . . . 319--331 Jiazheng Zhou and Xuan-Yi Lin and Yeh-Ching Chung Hardware supported multicast in fat-tree-based InfiniBand networks . . . 333--352
Wei Zhong and Gulsah Altun and Xinmin Tian and Robert Harrison and Phang C. Tai and Yi Pan Parallel protein secondary structure prediction schemes using Pthread and OpenMP over hyper-threading technology 1--16 Hyun-Wook Jin and Chuck Yoo Impact of protocol overheads on network throughput over high-speed interconnects: measurement, analysis, and improvement . . . . . . . . . . . . 17--40 Sun-Yuan Hsieh and Huang-Ming Gao On the partial terminal Steiner tree problem . . . . . . . . . . . . . . . . 41--52 Iyad A. Ajwa A case study of Grid Computing and computer algebra: parallel Gröbner Bases and Characteristic Sets . . . . . . . . 53--62 Chun-Yuan Lin and Yeh-Ching Chung Data distribution schemes of sparse arrays on distributed memory multicomputers . . . . . . . . . . . . . 63--87 Sung Hoon Park The election problem in asynchronous distributed systems with bounded faulty processes . . . . . . . . . . . . . . . 89--104
Steve C. Chiu High performance I/O architectures and systems . . . . . . . . . . . . . . . . 105--108 Ken-Ichi Nomura and Rajiv K. Kalia and Aiichiro Nakano and Priya Vashishta and Jorge L. Landa Parallel history matching and associated forecast at the center for interactive smart oilfield technologies . . . . . . 109--117 Yaohang Li and Daniel Chen and Xiaohong Yuan Trustworthy remote compiling services for Grid-based scientific applications 119--131 Tiki L. Suarez Access Grid technology in classroom and research environments . . . . . . . . . 133--145 Hong Jin and Danli Wang and Hongan Wang and Hui Wang Feedback fuzzy-DVS scheduling of control tasks . . . . . . . . . . . . . . . . . 147--162 Steve C. Chiu and Alok N. Choudhary and Danli Wang Network and device-level impacts: performance and reliability of active I/O storage systems . . . . . . . . . . 163--178
Seung Woo Son and Konrad Malkowski and Guilin Chen and Mahmut Kandemir and Padma Raghavan Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling . . 179--213 F. Safaei and A. Khonsari and M. Fathy and M. Ould-Khaoua Performance analysis of fault-tolerant routing algorithm in wormhole-switched interconnections . . . . . . . . . . . . 215--245 Wen-Chung Shih and Chao-Tung Yang and Shian-Shyong Tseng A performance-based parallel loop scheduling on Grid environments . . . . 247--267 N. Imani and H. Sarbazi-Azad and S. G. Akl Perfect load balancing on the star interconnection network . . . . . . . . 269--286
George A. Gravvanis and John P. Morrison and Geoffrey C. Fox Special issue on ``Grid Technologies'' 1--2 Heinz Stockinger Defining the Grid: a snapshot on the current view . . . . . . . . . . . . . . 3--17 Isaac D. Scherson and Enrique Cauich and Daniel S. Valencia Service discovery for GRID computing using LCAN-mapped hierarchical directories . . . . . . . . . . . . . . 19--32 Ashish M. Mehta and Jay Smith and H. J. Siegel and Anthony A. Maciejewski and Arun Jayaseelan and Bin Ye Dynamic resource allocation heuristics that manage tradeoff between makespan and robustness . . . . . . . . . . . . . 33--58 Pieter Thysebaert and Bruno Volckaert and Marc De Leenheer and Filip De Turck and Bart Dhoedt and Piet Demeester Dimensioning and on-line scheduling in Lambda Grids using divisible load concepts . . . . . . . . . . . . . . . . 59--82 Mark A. Baker and Matthew Grove Tycho: a wide-area messaging framework with an integrated virtual registry . . 83--106 Ralph H. Castain and Jeffrey M. Squyres Creating a transparent, distributed, and resilient computing environment: the OpenRTE project . . . . . . . . . . . . 107--123 Dawid Kurzyniec and Magdalena Slawińska and Jarosław Slawiński and Vaidy Sunderam Unibus: a contrarian approach to Grid computing . . . . . . . . . . . . . . . 125--144
K. M. Giannoutakis and G. A. Gravvanis and B. Clayton and A. Patil and T. Enright and J. P. Morrison Matching high performance approximate inverse preconditioning to architectural platforms . . . . . . . . . . . . . . . 145--163 Charalampos Konstantopoulos and Christos Kaklamanis Optimal hypercube simulation on the partitioned optical passive stars network . . . . . . . . . . . . . . . . 165--180 Chuliang Weng and Minglu Li and Xinda Lu Grid resource management based on economic mechanisms . . . . . . . . . . 181--199 Yung-Chia Lin and Yi-Ping You and Chung-Wen Huang and Jenq Kuen Lee and Wei-Kuan Shih and Ting-Ting Hwang Energy-aware scheduling and simulation methodologies for parallel security processors with multiple voltage domains 201--223 Sun-Yuan Hsieh and Tsong-Jie Lin and Hui-Ling Huang Panconnectivity and edge-pancyclicity of $3$-ary $N$-cubes . . . . . . . . . . . 225--233 Ahmed A. El Farag and Hatem M. El-Boghdadi and Samir I. Shaheen Improving utilization of reconfigurable resources using two-dimensional compaction . . . . . . . . . . . . . . . 235--250
Kuan-Ching Li and Yong-Kee Jun Foreword from Guest Editors . . . . . . 251--252 Jesus Luna and Manel Medina and Oscar Manso Using OGRO and CertiVeR to improve OCSP validation for Grids . . . . . . . . . . 253--266 Chien-Min Wang and Chun-Chen Hsu and Pangfeng Liu and Hsi-Min Chen and Jan-Jan Wu Optimizing server placement in hierarchical Grid environments . . . . . 267--282 Pangfeng Liu and Yi-Fang Lin and Jan-Jan Wu and Zhe-Hao Kang An optimal scheduling algorithm for an agent-based multicast strategy on irregular networks . . . . . . . . . . . 283--302 Shibin Qiu and Terran Lane and Cundong Yang Efficient search algorithms for RNAi target detection . . . . . . . . . . . . 303--319 Dong-Suk Hong and Hong-Koo Kang and Dong-Oh Kim and Jae-Kwan Yun and Ki-Joon Han A Grid-based node split algorithm for managing current location data of moving objects . . . . . . . . . . . . . . . . 321--337
David Vengerov A reinforcement learning framework for online data migration in hierarchical storage systems . . . . . . . . . . . . 1--19 Stéphane Genaud and Pierre Gançarski and Guillaume Latu and Alexandre Blansché and Choopan Rattanapoka and Damien Vouriot Exploitation of a parallel clustering algorithm on commodity hardware with P2P-MPI . . . . . . . . . . . . . . . . 21--41 Jin Hwan Park and H. K. Dai Reconfigurable hardware solution to parallel prefix computation . . . . . . 43--58 Yang Liu and Yu-Kwong Kwok and J. Wang High performance power control and opportunistic fair scheduling in TH-PPM UWB ad-hoc multimedia networks . . . . . 59--76 Shiyuan Jin and Guy Schiavone and Damla Turgut A performance study of multiprocessor task scheduling algorithms . . . . . . . 77--97 Hazem M. Bahig Parallel merging with restriction . . . 99--104
Minyi Guo and Jingling Xue Advances in high performance computing 105--106 Ming Liu and Jiannong Cao and Yuan Zheng and Haigang Gong and Xiaomin Wang An energy-efficient protocol for data gathering and aggregation in wireless sensor networks . . . . . . . . . . . . 107--125 Zhiyuan Shao and Hai Jin and Bin Cheng and Wenbin Jiang ER-TCP: an efficient TCP fault-tolerance scheme for cluster computing . . . . . . 127--145 Jingling Xue and Minyi Guo and Daming Wei Improving the parallelism of iterative methods by aggressive loop fusion . . . 147--164 Sang Boem Lim and Hanku Lee and Bryan Carpenter and Geoffrey Fox Runtime support for scalable programming in Java . . . . . . . . . . . . . . . . 165--182 Guihai Chen and Tongqing Qiu and Fan Wu Insight into redundancy schemes in DHTs 183--198
Dejiang Jin and Sotirios G. Ziavras Robust scalability analysis and SPM case studies . . . . . . . . . . . . . . . . 199--223 Chen Xu-Dong and Zhu Qing-Xin and Liao Yong and Xiong Guang Ze End-to-end deadline control for aperiodic tasks in distributed real-time systems . . . . . . . . . . . . . . . . 225--240 Fernando J. Barros Modeling and simulation of parallel adaptive divide-and-conquer algorithms 241--255 Abu Zafar M. Shahriar and M. Mostofa Akbar and M. Sohel Rahman and Muhammad Abdul Hakim Newton A multiprocessor based heuristic for multi-dimensional multiple-choice knapsack problem . . . . . . . . . . . . 257--280 Robert W. Numrich A metric space for computer programs and the principle of computational least action . . . . . . . . . . . . . . . . . 281--298
Chao-Tung Yang and Wen-Chung Shih and Shian-Shyong Tseng Dynamic partitioning of loop iterations on heterogeneous PC clusters . . . . . . 1--23 Mohsen Sharifi and Behrouz Zolfaghari YAARC: yet another approach to further reducing the rate of conflict misses . . 24--40 Ricolindo L. Cariño and Ioana Banicescu Dynamic load balancing with adaptive factoring methods in scientific applications . . . . . . . . . . . . . . 41--63 Evangelia Athanasaki and Nikos Anastopoulos and Kornilios Kourtis and Nectarios Koziris Exploring the performance limits of simultaneous multithreading for memory intensive applications . . . . . . . . . 64--97 Saralees Nadarajah Comment on ``A general model for long-tailed network traffic approximation'' . . . . . . . . . . . . 98--101
Md. Musfiquzzaman Akanda and Ben A. Abderazek and Masahiro Sowa Dual-execution mode processor architecture . . . . . . . . . . . . . . 103--125 Kuan-Chou Lai and Chao-Tung Yang A dominant predecessor duplication scheduling algorithm for heterogeneous systems . . . . . . . . . . . . . . . . 126--145 Rebecca J. Cathey and Steven M. Beitzel and Eric C. Jensen and David Grossman and Ophir Frieder Using a relational database for scalable XML search . . . . . . . . . . . . . . . 146--178 James P. Sweeney and Sanjay P. Ahuja Heuristic solutions to resource allocation in Grid computing: a natural approach . . . . . . . . . . . . . . . . 179--198
Yang Liu and Yu-Kwong Kwok and Jiangzhou Wang Downlink TCP performance under cross layer rate and power allocation in infrastructure TH-PPM UWB networks . . . 199--216 Lu Liu and Nick Antonopoulos and Stephen Mackin Managing peer-to-peer networks with human tactics in social interactions . . 217--236 Miguel O. Bernabeu and Pedro Alonso and Antonio M. Vidal A multilevel parallel algorithm to solve symmetric Toeplitz linear systems . . . 237--256 Majid Sabzehparvar and S. Mohammad Seyed-Hosseini A mathematical model for the multi-mode resource-constrained project scheduling problem with mode dependent time lags 257--273 Eunice E. Santos and Jeffrey M. Rickman and Gayathri Muthukrishnan and Shuangtong Feng Efficient algorithms for parallelizing Monte Carlo simulations for $2$D Ising spin models . . . . . . . . . . . . . . 274--290
Chung-Sheng Li and Yung-Chih Tseng and Han-Chieh Chao and Yueh-Min Huang A neighbor caching mechanism for handoff in IEEE 802.11 wireless networks . . . . 1--14 Paolo Bellavista and Antonio Corradi and Luca Foschini Dynamic and context-aware streaming adaptation to smooth quality degradation due to IEEE 802.11 performance anomaly 15--28 Jeong-Wook Seo and Sang-Soo Yeo and Kyoo Seok Park and Hang Bae Chang and Sung-Eon Cho and Dong-Ku Kim A robust and secure time-domain ICI canceller for OFDM based ubiquitous systems in time-varying multipath channels . . . . . . . . . . . . . . . . 29--43 Firdous Kausar and Sajid Hussain and Laurence T. Yang and Ashraf Masood Scalable and efficient key management for heterogeneous sensor networks . . . 44--65 Haiyan Li and Chunyuan Zhang and Li Li and Ju Ren Transform coding on programmable stream processors . . . . . . . . . . . . . . . 66--87 Deok Gyu Lee and Jong Hyuk Park and Tai-Hoon Kim and Laurence T. Yang U-multimedia framework: a secure and intelligent multimedia service framework based on context information in U-home 88--104 Hongju Cheng and Naixue Xiong and Laurence T. Yang and Young-Sik Jeong Distributed scheduling algorithms for channel access in TDMA wireless mesh networks . . . . . . . . . . . . . . . . 105--128 Ching-Hsien Hsu and Tai-Lung Chen and Jong-Hyuk Park On improving resource utilization and system throughput of master slave job scheduling in heterogeneous systems . . 129--150
Jih-Woei Huang and Chih-Ping Chu A flexible processor mapping technique toward data localization for block-cyclic data redistribution . . . . 151--172 Qiang Zhu On conditional diagnosability and reliability of the BC networks . . . . . 173--184 Imran Rao and Eui-Nam Huh A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-Grid resource sharing . . . . . . . . . . . . . . . . 185--204 Akshat Verma and Sandeep Sen Combating I-O bottleneck using prefetching: model, algorithms, and ramifications . . . . . . . . . . . . . 205--235 Jing Zhou and Guosun Zeng A general data dependence analysis for parallelizing compilers . . . . . . . . 236--252 Jing Zhou and Guosun Zeng A general data dependence analysis for parallelizing compilers . . . . . . . . 253--253
A. Kolcz and A. Chowdhury Lexicon randomization for near-duplicate detection with I-Match . . . . . . . . . 255--276 Ruay-Shiung Chang and Hui-Ping Chang A dynamic data replication strategy using access-weights in data grids . . . 277--295 Junfeng Wang and Jin Liu and Chundong She Segment-based adaptive hyper-Erlang model for long-tailed network traffic approximation . . . . . . . . . . . . . 296--312 Hamid Noori and Farhad Mehdipour and Kazuaki Murakami and Koji Inoue and Morteza Saheb Zamani An architecture framework for an adaptive extensible processor . . . . . 313--340 Antonio Flores and Juan L. Aragón and Manuel E. Acacio An energy consumption characterization of on-chip interconnection networks for tiled CMP architectures . . . . . . . . 341--364
Aminollah Mahabadi and Hamid Sarbazi-Azad and Ebrahim Khodaie and Keivan Navi Parallel Lagrange interpolation on $k$-ary $n$-cubes with maximum channel utilization . . . . . . . . . . . . . . 1--14 Mohammad Kalantari and Mohammad Kazem Akbari Fault-aware grid scheduling using performance prediction by workload modeling . . . . . . . . . . . . . . . . 15--39 Stavros Souravlas and Manos Roumeliotis A message passing strategy for array redistributions in a torus network . . . 40--57 Yang Liu and Yu-Kwong Kwok and Jiangzhou Wang On scheduling and clustering in hierarchical TH-PPM UWB wireless ad hoc networks . . . . . . . . . . . . . . . . 58--83 José R. Herrero and Juan J. Navarro Hypermatrix oriented supernode amalgamation . . . . . . . . . . . . . . 84--104
Steve C. Chiu Current issues in high performance computing I/O architectures and systems 105--107 John A. Chandy RAID0.5: design and implementation of a low cost disk array data protection method . . . . . . . . . . . . . . . . . 108--123 Yaohang Li and Yong-Duan Song An adaptive and trustworthy software testing framework on the grid . . . . . 124--138 Danli Wang and Guozhong Dai and Hongan Wang and Steve C. Chiu Scenario-focused development method for a pen-based user interface: model and applications . . . . . . . . . . . . . . 139--149 Rajagopal Subramaniyan and Eric Grobelny and Scott Studham and Alan D. George Optimization of checkpointing-related I/O for high-performance parallel and distributed computing . . . . . . . . . 150--180
Fatima K. Abu Salem and Laurence T. Yang Parallel methods for absolute irreducibility testing . . . . . . . . . 181--212 Florin Isaila and Walter F. Tichy Mapping functions and data redistribution for parallel files . . . 213--236 Bin Xiao and Jiannong Cao and Qin Lu Dynamic SPT update for multiple link state decrements in network routing . . 237--256 Eun-Ha Song and Laurence T. Yang and Young-Sik Jeong 3-Hierarchical resource management model on web grid service architecture . . . . 257--275 Basel A. Mahafzah and Bashira A. Jaradat The load balancing problem in OTIS-Hypercube interconnection networks 276--297
Hui Li Workload dynamics on clusters and grids 1--20 Kenneth L. Rice and Tarek M. Taha and Christopher N. Vutsinas Scaling analysis of a neocortex inspired cognitive model on the Cray XD1 . . . . 21--43 George A. Gravvanis OpenMP based parallel normalized direct methods for sparse finite element linear systems . . . . . . . . . . . . . . . . 44--52 David E. Singh and Florin Isaila and Juan C. Pichel and Jesús Carretero A collective I/O implementation based on inspector-executor paradigm . . . . . . 53--75 Chao-Tung Yang and Kuan-Chou Lai and Po-Chi Shih Design and implementation of a workflow-based resource broker with information system on computational grids . . . . . . . . . . . . . . . . . 76--109
Hsiao-Hsi Wang and Kuan-Ching Li and Ssu-Hsuan Lu and Chun-Chieh Yang Towards implementation of a novel scheme for data prefetching on distributed shared memory systems . . . . . . . . . 111--126 Guojun Wang and Tian Wang and Weijia Jia and Minyi Guo and Jie Li Adaptive location updates for mobile sinks in wireless sensor networks . . . 127--145 Kiran Kola and Ruppa K. Thulasiram and Parimala Thulasiraman A software architecture framework for on-line option pricing . . . . . . . . . 146--170 Xuejun Yang and Jing Du and Xiaobo Yan and Yu Deng Matrix-based streamization approach for improving locality and parallelism on FT64 stream processor . . . . . . . . . 171--197 Banani Roy and Michael Einhaus and Chanchal Kumar Roy A framework for development and evaluation of a dynamic subchannel allocation scheme in an OFDMA system . . 198--227 Satoshi Amamiya and Makoto Amamiya and Ryuzo Hasegawa and Hiroshi Fujita A continuation-based noninterruptible multithreading processor architecture 228--252
Jesus Carretero and Jose Daniel Garcia Scalability in data management . . . . . 253--254 Munehiro Fukuda and Jumpei Miyauchi An implementation of parallel file distribution in an agent hierarchy . . . 255--285 Jimmy Secretan and Malachi Lawson and Ladislau Bölöni Efficient allocation and composition of distributed storage . . . . . . . . . . 286--310 David E. Singh and Florin Isaila and Juan C. Pichel and Jesús Carretero Erratum: ``A collective I/O implementation based on inspector-executor paradigm'' . . . . . 311--311 A. Calderón and F. García-Carballeira and L. M. Sánchez and J. D. García and J. Fernandez Fault tolerant file models for parallel file systems: introducing distribution patterns for every file . . . . . . . . 312--334
Junjun Deng and Hengyong Yu and Jun Ni and Lihe Wang and Ge Wang Parallelism of iterative CT reconstruction based on local reconstruction algorithm . . . . . . . . 1--14 Tongqing Qiu and Edward Chan and Mao Ye and Guihai Chen and Ben Y. Zhao Peer-exchange schemes to handle mismatch in peer-to-peer systems . . . . . . . . 15--42 Kuan-Ching Li and Tien-Hsiung Weng Performance-based parallel application toolkit for high-performance clusters 43--65 Chung-Hao Chang and Cheng-Kuan Lin and Jimmy J. M. Tan and Hua-Min Huang and Lih-Hsing Hsu The super spanning connectivity and super spanning laceability of the enhanced hypercubes . . . . . . . . . . 66--87 Jose Ignacio Aliaga and Francisco Almeida and Jose Manuel Badía and Sergio Barrachina and Vicente Blanco and others Toward the parallelization of GSL . . . 88--114
Grigorios Dimitroulakos and Nikos Kostaras and Michalis D. Galanis and Costas E. Goutis Compiler assisted architectural exploration framework for coarse grained reconfigurable arrays . . . . . . . . . 115--151 Zhao Chen Mapping computations . . . . . . . . . . 152--162 Mario E. Caire and Francisco J. Lopez and David H. Williams Distributed identification of the lineality space of a cone . . . . . . . 163--182 Brian J. d'Auriol The systems edge of the Parameterized Linear Array with a Reconfigurable Pipelined Bus System (LARPBS(p)) optical bus parallel computing model . . . . . . 183--209 Kuan-Wei Cheng and Tzong-Yen Lin and Rong-Guey Chang Profile-based dynamic pipeline scaling 210--226
Yang Xiang Managing email overload with an automatic nonparametric clustering system . . . . . . . . . . . . . . . . . 227--242 Chuang Lin and Xin Jiang and Hao Yin and Yuanzhuo Wang and Yada Hu and others Optimizing availability and QoS of heterogeneous distributed system based on residual lifetime in uncertain environment . . . . . . . . . . . . . . 243--263 Jie Xu and Hai Jin A structured P2P network based on the small world phenomenon . . . . . . . . . 264--285 Charalampos Konstantopoulos and Basilis Mamalis and Grammati Pantziou and Damianos Gavalas Efficient parallel text retrieval techniques on Bulk Synchronous Parallel (BSP)/Coarse Grained Multicomputers (CGM) . . . . . . . . . . . . . . . . . 286--318 Sha Meng and Xin Kai Li and Ronaldo Mercado Smooth interfaces for spectral element method for the solution of incompressible Newtonian fluid flow . . 319--331
Christophe Cérin and Jean-Luc Gaudiot and Kuan-Ching Li Special issue of Supercomputing Journal on secure, manageable and controllable grid services . . . . . . . . . . . . . 1--3 Yuhui Deng and Frank Wang and Adrian Ciura Ant colony optimization inspired resource discovery in P2P Grid systems 4--21 Rodrigo F. de Mello and Laurence T. Yang Prediction of dynamical, nonlinear, and unstable process behavior . . . . . . . 22--41 Chieh-Wen Cheng and Jan-Jan Wu and Pangfeng Liu QoS-aware, access-efficient, and storage-efficient replica placement in grid environments . . . . . . . . . . . 42--63 Aurélie Hurault and Michel Daydé and Marc Pantel Advanced service trading for scientific computing over the grid . . . . . . . . 64--83 Julien Sopena and Luciana Arantes and Fabrice Legond-Aubry and Pierre Sens Building effective mutual exclusion services for grids . . . . . . . . . . . 84--107 P. Mazzoleni and B. Crispo and S. Sivasubramanian and E. Bertino Efficient integration of fine-grained access control and resource brokering in grid . . . . . . . . . . . . . . . . . . 108--126 Dan Liu and Yuan-Da Cao and Chun-Qing Li Liana: a decentralized load-dependent scheduler for performance-cost optimization of grid service . . . . . . 127--156 Anonymous Reviewers . . . . . . . . . . . . . . . 157--157
Deng Li and Zhi-Gang Chen and Hui Liu and Athanasios V. Vasilakos and Yi Pan IPBGA: a hybrid P2P based grid architecture by using information pool protocol . . . . . . . . . . . . . . . . 159--189 Chunlin Li and Layuan Li Hierarchical control policy for dynamic resource management in grid virtual organization . . . . . . . . . . . . . . 190--218 Jung-Sheng Fu and Hao-Shun Hung and Gen-Huey Chen Embedding fault-free cycles in crossed cubes with conditional link faults . . . 219--233 Hsun-Jung Cho and Chien-Lun Lan Hybrid shortest path algorithm for vehicle navigation . . . . . . . . . . . 234--247 Mary M. Eshaghian-Wilner and Shiva Navab Efficient parallel processing with spin-wave nanoarchitectures . . . . . . 248--267
E. I. Milovanović and M. P. Bekakos and I. Z. Milovanović Synthesis of space optimal systolic arrays for band matrix-vector multiplication . . . . . . . . . . . . . 269--290 Abderrahmane Sider and Raphaël Couturier Fast load balancing with the most to least loaded policy in dynamic networks 291--317 Yan-Qing Zhang and Yi Pan Incomplete crossed hypercubes . . . . . 318--333 Ayaz Isazadeh and Jaber Karimpour A new formalism for mathematical description and verification of component-based systems . . . . . . . . 334--353 Sun-Yuan Hsieh and Chang-De Wu Optimal fault-tolerant Hamiltonicity of star graphs with conditional edge faults 354--372 Gavriel Yarmish and Richard Van Slyke A distributed, scaleable simplex method 373--381
Babar Nazir and Kalim Qureshi and Paul Manuel Adaptive checkpointing strategy to tolerate faults in economy based grid 1--18 L. Mohammad Khanli and M. Analoui Active Grid Information Server for grid computing . . . . . . . . . . . . . . . 19--35 Georgios Goumas and Kornilios Kourtis and Nikos Anastopoulos and Vasileios Karakasis and Nectarios Koziris Performance evaluation of the sparse matrix-vector multiplication on modern architectures . . . . . . . . . . . . . 36--77 Yu-Te Lin and Peng-Sheng Chen Compiler support for general-purpose computation on GPUs . . . . . . . . . . 78--97
Yong-Ju Lee and Yoo-Hyun Park and Song-Woo Sok and Hag-Young Kim and Cheol-Hoon Lee Fast-path I/O architecture for high performance streaming server . . . . . . 99--120 José Ramón González de Mendívil and José Enrique Armendáriz-Iñigo and José Ramón Garitagoitia and Francesc D. Muñoz-Escoí A formal analysis of database replication protocols with SI replicas and crash failures . . . . . . . . . . . 121--161 Chien-Hung Huang and Jywe-Fei Fang The panconnectivity and the pancycle-connectivity of the generalized base-$b$ hypercube . . . . . . . . . . . 162--176 Chao-Tung Yang and Kuan-Chou Lai A directive-based MPI code generator for Linux PC clusters . . . . . . . . . . . 177--207
John Paul Walters and Vipin Chaudhary A fault-tolerant strategy for virtualized HPC clusters . . . . . . . . 209--239 Ching-Wen Chen and Chuan-Chi Weng Bandwidth-based routing protocols in mobile ad hoc networks . . . . . . . . . 240--268 Ching-Hsien Hsu and Bing-Ru Tsai Scheduling for atomic broadcast operation in heterogeneous networks with one port model . . . . . . . . . . . . . 269--288 Shahriar Lotfi and Saeed Parsa Parallel loop generation and scheduling 289--306