
In computing, a benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it.[1]

[Figure: a graphical demo running as a benchmark of the OGRE engine]

The term benchmark is also commonly used to refer to elaborately designed benchmarking programs themselves.

Benchmarking is usually associated with assessing performance characteristics of computer hardware, for example, the floating point operation performance of a CPU, but there are circumstances when the technique is also applicable to software. Software benchmarks are, for example, run against compilers or database management systems (DBMS).

Benchmarks provide a method of comparing the performance of various subsystems across different chip/system architectures. Benchmarking as a part of continuous integration is called Continuous Benchmarking.[2]
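
A continuous-benchmarking step can be sketched as a script that runs on every build and fails the pipeline when performance regresses against a stored baseline. The workload, baseline file name, and 10% threshold below are invented for illustration, not taken from any particular CI product:

```python
import json
import time
from pathlib import Path

BASELINE_FILE = Path("benchmark_baseline.json")  # hypothetical baseline location
REGRESSION_LIMIT = 1.10  # fail the build if more than 10% slower than baseline

def workload() -> None:
    """Stand-in workload; a real pipeline would run the project's benchmark suite."""
    total = 0
    for i in range(1_000_000):
        total += i * i

def measure(repeats: int = 5) -> float:
    """Return the best wall-clock time over several repeats to reduce noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    seconds = measure()
    if BASELINE_FILE.exists():
        baseline = json.loads(BASELINE_FILE.read_text())["seconds"]
        if seconds > baseline * REGRESSION_LIMIT:
            raise SystemExit(f"regression: {seconds:.3f}s vs baseline {baseline:.3f}s")
    else:
        BASELINE_FILE.write_text(json.dumps({"seconds": seconds}))  # record first run
    print(f"ok: {seconds:.3f}s")
```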

Purpose

As computer architecture advanced, it became more difficult to compare the performance of various computer systems simply by looking at their specifications. Therefore, tests were developed that allowed comparison of different architectures. For example, Pentium 4 processors generally operated at a higher clock frequency than Athlon XP or PowerPC processors, which did not necessarily translate to more computational power; a processor with a slower clock frequency might perform as well as or even better than a processor operating at a higher frequency. See BogoMips and the megahertz myth.

Benchmarks are designed to mimic a particular type of workload on a component or system. Synthetic benchmarks do this by specially created programs that impose the workload on the component. Application benchmarks run real-world programs on the system. While application benchmarks usually give a much better measure of real-world performance on a given system, synthetic benchmarks are useful for testing individual components, like a hard disk or networking device.

Benchmarks are particularly important in CPU design, giving processor architects the ability to measure and make tradeoffs in microarchitectural decisions. For example, if a benchmark extracts the key algorithms of an application, it will contain the performance-sensitive aspects of that application. Running this much smaller snippet on a cycle-accurate simulator can give clues on how to improve performance.

Prior to 2000, computer and microprocessor architects used SPEC to do this, although SPEC's Unix-based benchmarks were quite lengthy and thus unwieldy to use intact.

Computer companies are known to configure their systems to give unrealistically high performance on benchmark tests that are not replicated in real usage. For instance, during the 1980s some compilers could detect a specific mathematical operation used in a well-known floating-point benchmark and replace the operation with a faster mathematically equivalent operation. However, such a transformation was rarely useful outside the benchmark until the mid-1990s, when RISC and VLIW architectures emphasized the importance of compiler technology as it related to performance. Benchmarks are now regularly used by compiler companies to improve not only their own benchmark scores, but real application performance.

CPUs that have many execution units — such as a superscalar CPU, a VLIW CPU, or a reconfigurable computing CPU — typically have slower clock rates than a sequential CPU with one or two execution units when built from transistors that are just as fast. Nevertheless, CPUs with many execution units often complete real-world and benchmark tasks in less time than the supposedly faster high-clock-rate CPU.

Given the large number of benchmarks available, a vendor can usually find at least one benchmark that shows its system will outperform another system; the other systems can be shown to excel with a different benchmark.

Software vendors also use benchmarks in their marketing, such as the "benchmark wars" between rival relational database makers in the 1980s and 1990s. Companies commonly report only those benchmarks (or aspects of benchmarks) that show their products in the best light. They also have been known to misrepresent the significance of benchmarks, again to show their products in the best possible light.[3][4]

Ideally, benchmarks should substitute for real applications only if the application is unavailable, or too difficult or costly to port to a specific processor or computer system. If performance is critical, the only benchmark that matters is the target environment's application suite.

Functionality

Features of benchmarking software may include recording and exporting the course of a performance run to a spreadsheet file, visualization such as line graphs or color-coded tiles, and pausing the process so that it can be resumed without starting over. Software can have additional features specific to its purpose; for example, disk benchmarking software may be able to optionally measure disk speed within a specified range of the disk rather than the full disk, measure random-access read speed and latency, offer a "quick scan" feature that measures speed through samples of specified intervals and sizes, and allow specifying a data block size, meaning the number of requested bytes per read request.[5]
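
As a rough sketch of the block-size idea above (the file name, file size, and block sizes are arbitrary choices for the example, not the behavior of any particular tool), a minimal sequential-read benchmark might time reads of a file at several block sizes:

```python
import os
import time

def sequential_read_speed(path: str, block_size: int) -> float:
    """Read the whole file in block_size chunks; return throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / elapsed

if __name__ == "__main__":
    # Create a throwaway 64 MiB test file; a real tool would target a raw
    # device or a specified range of the disk.
    path = "benchmark_test.bin"
    with open(path, "wb") as f:
        f.write(os.urandom(64 * 1024 * 1024))
    for bs in (4 * 1024, 64 * 1024, 1024 * 1024):  # compare block sizes
        # Note: repeated runs largely measure the OS page cache; real disk
        # benchmarks bypass or flush the cache first.
        print(f"block size {bs:>8} B: {sequential_read_speed(path, bs):6.1f} MB/s")
    os.remove(path)
```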

Challenges

Benchmarking is not easy and often involves several iterative rounds in order to arrive at predictable, useful conclusions. Interpretation of benchmarking data is also extraordinarily difficult. Here is a partial list of common challenges:

  • Vendors tend to tune their products specifically for industry-standard benchmarks. Norton SysInfo (SI) is particularly easy to tune for, since it is mainly biased toward the speed of multiple operations. Use extreme caution in interpreting such results.
  • Some vendors have been accused of "cheating" at benchmarks — designing their systems such that they give much higher benchmark numbers, but are not as effective at the actual likely workload.[6]
  • Many benchmarks focus entirely on the speed of computational performance, neglecting other important features of a computer system, such as:
    • Qualities of service, aside from raw performance. Examples of unmeasured qualities of service include security, availability, reliability, execution integrity, serviceability, scalability (especially the ability to quickly and nondisruptively add or reallocate capacity), etc. There are often real trade-offs between and among these qualities of service, and all are important in business computing. Transaction Processing Performance Council Benchmark specifications partially address these concerns by specifying ACID property tests, database scalability rules, and service level requirements.
    • In general, benchmarks do not measure Total cost of ownership. Transaction Processing Performance Council Benchmark specifications partially address this concern by specifying that a price/performance metric must be reported in addition to a raw performance metric, using a simplified TCO formula. However, the costs are necessarily only partial, and vendors have been known to price specifically (and only) for the benchmark, designing a highly specific "benchmark special" configuration with an artificially low price. Even a tiny deviation from the benchmark package results in a much higher price in real world experience.
    • Facilities burden (space, power, and cooling). When more power is used, a portable system will have a shorter battery life and require recharging more often. A server that consumes more power and/or space may not be able to fit within existing data center resource constraints, including cooling limitations. There are real trade-offs as most semiconductors require more power to switch faster. See also performance per watt.
    • In some embedded systems, where memory is a significant cost, better code density can significantly reduce costs.
  • Vendor benchmarks tend to ignore requirements for development, test, and disaster recovery computing capacity. Vendors only like to report what might be narrowly required for production capacity in order to make their initial acquisition price seem as low as possible.
  • Benchmarks are having trouble adapting to widely distributed servers, particularly those with extra sensitivity to network topologies. The emergence of grid computing, in particular, complicates benchmarking since some workloads are "grid friendly", while others are not.
  • Users can have very different perceptions of performance than benchmarks may suggest. In particular, users appreciate predictability: servers that always meet or exceed service level agreements. Benchmarks tend to emphasize mean scores (IT perspective), rather than maximum worst-case response times (real-time computing perspective) or low standard deviations (user perspective); the sketch after this list shows how these summaries can diverge.
  • Many server architectures degrade dramatically at high (near 100%) levels of usage ("fall off a cliff"), and benchmarks should (but often do not) take that factor into account. Vendors, in particular, tend to publish server benchmarks run continuously at about 80% usage (an unrealistic situation) and do not document what happens to the overall system when demand spikes beyond that level.
  • Many benchmarks focus on one application, or even one application tier, to the exclusion of other applications. Most data centers are now implementing virtualization extensively for a variety of reasons, and benchmarking is still catching up to that reality where multiple applications and application tiers are concurrently running on consolidated servers.
  • There are few (if any) high quality benchmarks that help measure the performance of batch computing, especially high volume concurrent batch and online computing. Batch computing tends to be much more focused on the predictability of completing long-running tasks correctly before deadlines, such as end of month or end of fiscal year. Many important core business processes are batch-oriented and probably always will be, such as billing.
  • Benchmarking institutions often disregard or do not follow basic scientific method. This includes, but is not limited to: small sample size, lack of variable control, and the limited repeatability of results.[7]
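
The statistics point above can be made concrete. With the same invented set of repeated measurements, the mean, the worst case, and the spread tell different stories; Fleming and Wallace[1] further argue that normalized results should be summarized with the geometric mean:

```python
import statistics

# Invented response times (ms) from repeated runs of the same benchmark.
latencies_ms = [98, 101, 99, 102, 100, 97, 350, 99, 101, 98]

mean = statistics.mean(latencies_ms)     # IT perspective: 124.5 ms, looks fine
worst = max(latencies_ms)                # real-time perspective: 350 ms blew the deadline
spread = statistics.stdev(latencies_ms)  # user perspective: ~79 ms, unpredictable

print(f"mean {mean:6.1f} ms, worst {worst} ms, stdev {spread:6.1f} ms")

# For scores normalized against a reference machine, the geometric mean is
# independent of which machine is chosen as the reference.
normalized_scores = [1.2, 0.8, 1.5, 0.9]  # invented speedups vs a baseline
print(f"geometric mean {statistics.geometric_mean(normalized_scores):.3f}")
```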

Benchmarking principles

There are seven vital characteristics for benchmarks.[8] These key properties are:

  1. Relevance: Benchmarks should measure relatively vital features.
  2. Representativeness: Benchmark performance metrics should be broadly accepted by industry and academia.
  3. Equity: All systems should be fairly compared.
  4. Repeatability: Benchmark results can be verified.
  5. Cost-effectiveness: Benchmark tests are economical.
  6. Scalability: Benchmark tests should work across systems possessing a range of resources from low to high.
  7. Transparency: Benchmark metrics should be easy to understand.

Types of benchmark

  1. Real program
  2. Component Benchmark / Microbenchmark
    • core routine consists of a relatively small and specific piece of code.
    • measure performance of a computer's basic components[9]
    • may be used for automatic detection of computer's hardware parameters like number of registers, cache size, memory latency, etc.
  3. Kernel
    • contains key code
    • normally abstracted from an actual program
    • popular kernel: Livermore loops
    • LINPACK benchmark (basic linear algebra subroutines written in Fortran)
    • results are represented in Mflop/s.
  4. Synthetic Benchmark
    • Procedure for programming a synthetic benchmark (a sketch follows this list):
      • take statistics of all types of operations from many application programs
      • get proportion of each operation
      • write program based on the proportion above
    • Types of synthetic benchmarks include Whetstone and Dhrystone. These were the first general-purpose industry-standard computer benchmarks. They do not necessarily obtain high scores on modern pipelined computers.
  5. I/O benchmarks
  6. Database benchmarks
    • measure the throughput and response times of database management systems (DBMS)
  7. Parallel benchmarks
    • used on machines with multiple cores and/or processors, or systems consisting of multiple machines
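
The three-step synthetic-benchmark procedure above can be sketched as follows. The operation mix is invented for illustration; a real synthetic benchmark would derive its proportions from measured statistics over many application programs:

```python
import random
import time

# Steps 1-2 (assumed done offline): profiling many applications yields an
# operation mix; these proportions are invented for the example.
OPERATION_MIX = {"int_add": 0.40, "float_mul": 0.30, "mem_copy": 0.20, "branch": 0.10}

def int_add(n=10_000):
    total = 0
    for i in range(n):
        total += i
    return total

def float_mul(n=10_000):
    x = 1.0
    for _ in range(n):
        x *= 1.0000001
    return x

def mem_copy(n=10_000):
    return list(range(n))[:]  # build and copy a list to exercise memory traffic

def branch(n=10_000):
    hits = 0
    for i in range(n):
        if i % 3 == 0:  # data-dependent branch
            hits += 1
    return hits

OPS = {"int_add": int_add, "float_mul": float_mul, "mem_copy": mem_copy, "branch": branch}

def run(iterations=1_000, seed=42):
    """Step 3: execute operations in proportion to the measured mix and time it."""
    rng = random.Random(seed)
    names, weights = zip(*OPERATION_MIX.items())
    start = time.perf_counter()
    for _ in range(iterations):
        OPS[rng.choices(names, weights)[0]]()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"synthetic workload completed in {run():.2f} s")
```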

Common benchmarks

Industry standard (audited and verifiable)

  • SPEC – suites from the Standard Performance Evaluation Corporation, such as SPEC CPU, with results reviewed and published under SPEC's run rules
  • TPC – Transaction Processing Performance Council benchmarks such as TPC-C and TPC-H, with independently audited results that must include a price/performance metric[10]

Open source benchmarks

  • AIM Multiuser Benchmark – composed of a list of tests that could be mixed to create a 'load mix' that would simulate a specific computer function on any UNIX-type OS.
  • Bonnie++ – filesystem and hard drive benchmark
  • BRL-CAD – cross-platform architecture-agnostic benchmark suite based on multithreaded ray tracing performance; baselined against a VAX-11/780; and used since 1984 for evaluating relative CPU performance, compiler differences, optimization levels, coherency, architecture differences, and operating system differences.
  • Collective Knowledge – customizable, cross-platform framework to crowdsource benchmarking and optimization of user workloads (such as deep learning) across hardware provided by volunteers
  • Coremark – Embedded computing benchmark
  • DEISA Benchmark Suite – scientific HPC applications benchmark
  • Dhrystone – integer arithmetic performance, often reported in DMIPS (Dhrystone millions of instructions per second)
  • DiskSpd – Command-line tool for storage benchmarking that generates a variety of requests against computer files, partitions or storage devices
  • Fhourstones – an integer benchmark
  • HINT – designed to measure overall CPU and memory performance
  • Iometer – I/O subsystem measurement and characterization tool for single and clustered systems.
  • IOzone – Filesystem benchmark
  • LINPACK benchmarks – traditionally used to measure FLOPS
  • Livermore loops
  • NAS parallel benchmarks
  • NBench – synthetic benchmark suite measuring performance of integer arithmetic, memory operations, and floating-point arithmetic
  • PAL – a benchmark for realtime physics engines
  • PerfKitBenchmarker – A set of benchmarks to measure and compare cloud offerings.
  • Phoronix Test Suite – open-source cross-platform benchmarking suite for Linux, OpenSolaris, FreeBSD, OS X and Windows. It integrates a number of the other benchmarks listed on this page to simplify execution.
  • POV-Ray – 3D render
  • Tak (function) – a simple benchmark used to test recursion performance
  • TATP Benchmark – Telecommunication Application Transaction Processing Benchmark
  • TPoX – An XML transaction processing benchmark for XML databases
  • VUP (VAX unit of performance) – also called VAX MIPS
  • Whetstone – floating-point arithmetic performance, often reported in millions of Whetstone instructions per second (MWIPS)

Microsoft Windows benchmarks

Unusual benchmark

Others

  • AnTuTu – commonly used on phones and ARM-based devices.
  • Byte Sieve – originally tested language performance, but widely used as a machine benchmark as well.
  • Creative Computing Benchmark – Compares the BASIC programming language on various platforms. Introduced in 1983.
  • Geekbench – A cross-platform benchmark for Windows, Linux, macOS, iOS and Android.
  • iCOMP – the Intel comparative microprocessor performance, published by Intel
  • Khornerstone
  • Novabench – a computer benchmarking utility for Microsoft Windows, macOS, and Linux
  • Performance Rating – modeling scheme used by AMD and Cyrix to reflect relative performance, usually compared to competing products.
  • Rugg/Feldman benchmarks – one of the earliest microcomputer benchmarks, from 1977.
  • SunSpider – a browser speed test
  • UserBenchmark – PC benchmark utility
  • VMmark – a virtualization benchmark suite.

References

  1. ^ Fleming, Philip J.; Wallace, John J. (March 1986). "How not to lie with statistics: the correct way to summarize benchmark results". Communications of the ACM. 29 (3): 218–221. doi:10.1145/5666.5673. ISSN 0001-0782. S2CID 1047380.
  2. ^ Grambow, Martin; Lehmann, Fabian; Bermbach, David (2019). "Continuous Benchmarking: Using System Benchmarking in Build Pipelines". 2019 IEEE International Conference on Cloud Engineering (IC2E). pp. 241–246. doi:10.1109/IC2E.2019.00039. ISBN 978-1-7281-0218-4.
  3. ^ "RDBMS Workshop: Informix" (PDF) (Interview). Interviewed by Luanne Johnson. Computer History Museum.
  4. ^ "RDBMS Workshop: Ingres and Sybase" (PDF) (Interview). Interviewed by Doug Jerger. Computer History Museum.
  5. ^ Software: HDDScan, GNOME Disks
  6. ^ Krazit, Tom (2003). "NVidia's Benchmark Tactics Reassessed". IDG News. Archived from the original.
  7. ^ Castor, Kevin (2006). "Hardware Testing and Benchmarking Methodology". Archived from the original.
  8. ^ Dai, Wei; Berleant, Daniel (December 12–14, 2019). "Benchmarking Contemporary Deep Learning Hardware and Frameworks: a Survey of Qualitative Metrics" (PDF). 2019 IEEE First International Conference on Cognitive Machine Intelligence (CogMI). Los Angeles, CA, USA: IEEE. pp. 148–155. arXiv:1907.03626. doi:10.1109/CogMI48466.2019.00029.
  9. ^ Ehliar, Andreas; Liu, Dake. "Benchmarking network processors" (PDF).
  10. ^ Transaction Processing Performance Council (February 1998). "History and Overview of the TPC". TPC. Transaction Processing Performance Council.
