References
[1]Diefendorff K,Dubey P K,Hochsprung R,et al.AltiVec Extension to PowerPC Accelerates Media Processing[J].IEEE Micro,2000,20(2):85-95.
[2]Boggs D,Baktha A,Hawkins J,et al.The Microarchitecture of the Intel Pentium 4 Processor on 90 nm Technology[J].Intel Technology Journal,2004,8(1):7-23.
[3]Singh J P,Gupta A,Ohara M,et al.The SPLASH-2 Programs:Characterization and Methodological Considerations[C]//Proceedings of the 22nd Annual International Symposium on Computer Architecture.New York,USA:ACM Press,1995:24-36.
[4]Sweetman D.See MIPS Run[M].San Francisco,USA:Morgan Kaufmann Publishers Inc.,2006.
[5]Sites R L.Alpha Architecture Reference Manual[M].[S.l.]:Digital Press,1992.
[6]Nuzman D,Henderson R.Multi-platform Auto-vectorization[C]//Proceedings of International Symposium on Code Generation and Optimization.Washington D.C.,USA:IEEE Computer Society,2006:281-294.
[7]Eichenberger A E,Wu Peng,O’Brien K.Vectorization for SIMD Architectures with Alignment Constraints[C]//Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation.New York,USA:ACM Press,2004:82-93.
[8]Wu Peng,Eichenberger A E,Wang A.Efficient SIMD Code Generation for Runtime Alignment and Length Conversion[C]//Proceedings of the International Symposium on Code Generation and Optimization.Washington D.C.,USA:IEEE Computer Society,2005:153-164.
[9]Larsen S,Witchel E,Amarasinghe S P.Increasing and Detecting Memory Address Congruence[C]//Proceedings of the 11th International Conference on Parallel Architectures and Compilation Techniques.Washington D.C.,USA:IEEE Computer Society,2002:18-29.
[10]Shahbahrami A,Juurlink B,Vassiliadis S.Performance Impact of Misaligned Accesses in SIMD Extensions[C]//Proceedings of the 17th Annual Workshop on Circuits,Systems and Signal Processing.Washington D.C.,USA:IEEE Press,2006:23-24.
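[11]Li Yuxiang,Shi Hui,Chen Li.Research and Improvement of SIMD Vectorization Algorithms for Non-multimedia Programs[J].Journal of Chinese Computer Systems,2009,30(10):1927-1935.(in Chinese)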
[12]Zhang K X.Buffer for a Split Cache Line Access:US6862225[P].2005-03-01.
[13]Alvarez M,Salami E,Ramirez A,et al.Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications[C]//Proceedings of IEEE International Symposium on Performance Analysis of Systems & Software.Washington D.C.,USA:IEEE Press,2007:62-71.
[14]Bik A J C,Girkar M,Grey P M,et al.Automatic Intra-register Vectorization for the Intel Architecture[J].International Journal of Parallel Programming,2002,30(2):65-98.
[15]Binkert N,Beckmann B,Black G,et al.The Gem5 Simulator[J].ACM SIGARCH Computer Architecture News,2011,39(2):1-7.
Editor: Lu Yanfei