Modeling of Pattern Dependencies in the Fabrication of Multilevel Copper Metallization

By

Hong Cai

Submitted to the Department of Materials Science and Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June, 2007

© Massachusetts Institute of Technology, 2007. All Rights Reserved

Author

Materials Science and Engineering

March 1, 2007

Certified by

Duane S. Boning
Professor

Electrical Engineering and Computer Science

Certified by

Carl V. Thompson
Stavros Salapatas Professor
Materials Science and Engineering

Accepted by

Samuel M. Allen
POSCO Professor of Physical Metallurgy
Chair, Departmental Committee on Graduate Studies
Modeling of Pattern Dependencies in the Fabrication of Multilevel Copper Metallization

By

Hong Cai

Submitted to the Department of Materials Science and Engineering on February 13, 2007 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Materials Science and Engineering

Abstract

Multilevel copper metallization for Ultra-Large-Scale-Integrated (ULSI) circuits is a critical technology needed to meet performance requirements for advanced interconnect technologies with sub-micron dimensions. It is well known that multilevel topography resulting from pattern dependencies in various processes, especially copper Electrochemical Deposition (ECD) and Chemical-Mechanical Planarization (CMP), is a major problem in interconnects. This thesis contributes an integrated pattern-dependent chip-scale model for multilevel copper metallization. The model is intended to help understand and meet dishing and erosion requirements; to optimize the combined plating and polishing process for minimal environmental impact and higher yield and performance; and to enable optimization of layout and dummy fill designs.

First, a physics-based chip-scale copper ECD model is developed. By considering copper ion depletion effects, as well as surface additive adsorption and desorption, the plating model is able to predict the initial topography for subsequent CMP modeling with sufficient accuracy and computational efficiency. Second, a compatible chip-scale CMP model is developed. The CMP model integrates contact wear and density-step-height approaches, so that a consistent and coherent chip-scale model framework can be used for the copper bulk polishing, copper over-polishing, and barrier layer polishing stages. A variant of this CMP model is developed that explicitly considers pad topography properties. Finally, the ECD and CMP parts are combined into an integrated model applicable to single-level and multilevel metallization cases.

The integrated multilevel copper metallization model is applied to the co-optimization of the plating and CMP processes. An alternative in-pattern (rather than between-pattern) dummy fill strategy is proposed. The integrated ECD/CMP model is applied to the optimization of the in-pattern fill, to achieve improved ECD uniformity and final post-CMP topography.

Thesis Supervisor: Duane S. Boning
Title: Professor of Electrical Engineering and Computer Science
Thesis Supervisor: Carl V. Thompson
Title: Professor of Materials Science and Engineering
Acknowledgements

When I review my several years at MIT, many thanks go to the people and organizations who have supported me in my research and life.

First of all, I would like to acknowledge the financial support from the SRC/Sematech Engineering Research Center for Environmentally Benign Semiconductor Manufacturing and MagnaChip Semiconductor, previously Hynix Semiconductor Inc. My humble appreciation also goes to my collaborators in Korea, Mr. Hyungjun Kim, Mr. Youngsoo Kang, Mr. Sibum Kim, and Mr. Jeong-Gun Lee; although I never had the opportunity to meet you face to face, without your kind help and guidance, this project would not have been successful.

I thank Professor Caroline Ross and my thesis committee, Professors Duane Boning, Carl Thompson and Eugene Fitzgerald. Your guidance and insight in my research gave me great incentive and motivation. Your contributions to my thesis work cannot be overstated. It has been my great honor to work with you during my years at MIT.

I also thank Mr. Joshua Tower from Philips Advanced Metrology Systems and Dr. Tamba Gbondo-Tugbawa, Dr. Kuang-Han Chen and Dr. Aaron Gower-Hall from Praesagus Inc. (now part of Cadence) for technology support. Without Mr. Tower’s help in obtaining copper thickness data, my project would have stalled. The people from Praesagus provided significant metrology and layout software support. I cannot remember how many times I wrote emails to ask for your help. I really appreciate your contribution.

I would also like to thank my previous and current officemates. With you, the research group became a big family and I enjoyed every day. I would like to express my whole-hearted appreciation to my colleague, Dr. Tae Park. You taught me, from the beginning, research skills as well as communication skills. Your help in communicating with MagnaChip was the guarantee of our success in this project. Although you left our group a couple of years ago, I can still hear your voice and laughter in the office. Thanks to Xiaolin Xie, Daniel Truque, Diahyun Lim, Karthik Balakrishnan, Karen Gettings, Nigel Drego, Hayden Taylor, and Ajay Somani. We shared many research discussions, laughs and donuts.

Finally, very special thanks go to Huiwen Yang, the love of my life and my wife. I could not imagine my life at MIT without you. I wish this thesis to be my precious gift to you. My thanks also go to my parents and brother. Your support from my remote hometown inspired me to overcome every difficult time.
Table of Contents

ABSTRACT

ACKNOWLEDGEMENTS

CHAPTER 1 INTRODUCTION AND MOTIVATION FOR RESEARCH
  1.1 Overview of Multilevel Copper Metallization
    1.1.1 Single Damascene
    1.1.2 Dual Damascene
  1.2 Copper ECD
    1.2.1 ECD Mechanism and Model
    1.2.2 Copper ECD Challenges and New Technologies
  1.3 Copper CMP
    1.3.1 CMP Tools
    1.3.2 CMP Mechanisms and Models
    1.3.3 CMP Challenges and New Technologies
  1.4 Previous Research and Thesis Goals
  1.5 Thesis Organization

CHAPTER 2 METHODOLOGY AND EXPERIMENT PLAN
  2.1 Methodology
  2.2 Experimental Plan
  2.3 Measurement Plan
  2.4 Summary

CHAPTER 3 TIME-STEPPED CHIP-SCALE ECD MODEL
  3.1 Previous ECD Model
  3.2 Experimental Data and Measurements for Model Calibration
    3.2.1 Field Area Measurements
    3.2.2 Line/Space Array Measurements
    3.2.3 Edge Effects in Array Scans
  3.3 Terminology
    3.3.1 Topography Variables
    3.3.2 Layout Variables
  3.4 ECD Model Framework
    3.4.1 Feature-Scale Model Review
    3.4.2 Chip-Scale Model
  3.5 Model Calibration
    3.5.1 Calibration Results
    3.5.2 HRP Scan Simulation
    3.5.3 Chip-Scale Simulation
  3.6 Model Verification
  3.7 M2 Modeling and Simulation
List of Figures

Figure 1-1. Copper metallization characteristic of a six-level structure
Figure 1-2. Circuit delay as a function of the feature size (low K=2)
Figure 1-3. Cross-section of hierarchical scaling
Figure 1-4. Single damascene process flow
Figure 1-5. Dual damascene process flow
Figure 1-6. Principle of copper ECD
Figure 1-7. Evolution of hole filling with different deposition conditions
Figure 1-8. Schematic representation of the hypothetical copper system
Figure 1-9. Results of copper fill experiments. X: feature size, Y: Nominal field copper thickness
Figure 1-10. Additive behavior during copper ECD fill of narrow and deep feature
Figure 1-11. Filling contours predicted by the simple model (left) and the level-set code (right)
Figure 1-12. Copper ECD pattern dependency
Figure 1-13. Rotary CMP tool
Figure 1-14. Orbital kinematics of Novellus Xceda CMP system
Figure 1-15. LAM LPT CMP system and linear kinematics of
Figure 1-16. Schematic of slurry thickness, friction force, and wafer/pad contact regimes for CMP processes
Figure 1-17. Schematic illustration of copper CMP mechanisms using (a) the conventional slurry and (b) the AFP solution
Figure 1-18. Copper CMP process steps
Figure 1-19. Copper CMP pattern dependency
Figure 1-20. Mechanisms of electropolishing and ECMP
Figure 1-21. Underlying topography effects in multilevel copper metallization
Figure 1-22. Multilevel test mask layout. Metal 1: blue (dark); Metal 2: magenta (light)
Figure 2-1. Methodology for M1 ECD and CMP
Figure 2-2. Process flow for modeling and simulation of multilevel copper metallization
Figure 2-3. Measurement plan for electroplating and CMP: MIT 854 mask, level 1
Figure 3-1. Step height and array height definitions in ECD
Figure 3-2. Surface area calculation
Figure 3-3. 41-site total copper thickness measurements for MIT/SEMATECH 854 M1 wafers
Figure 3-4. Locations of 41-site for M1 copper thickness measurements
Figure 3-5. Array scans and long scans for M1 HRP measurements
Figure 3-6. Bus structure of MIT/SEMATECH 854 M1 mask
Figure 3-7. Post-plating surface HRP array scans for CPT 115-01 (X: µm, Y: Å)
Figure 3-8. Post-plating surface HRP array scans for CPT 104-01 (X: µm, Y: Å)
Figure 3-9. Post-plating surface HRP array scans for CPT 115-04 (X: µm, Y: Å)
Figure 3-10. Post-plating surface HRP array scans for CPT 105-01 (X: µm, Y: Å)
Figure 3-11. Post-plating surface HRP array scans for CPT 115-07 (X: µm, Y: Å)
Figure 3-12. Post-plating surface HRP long scans for CPT 115-01 (X: μm, Y: Å)
Figure 3-13. Post-plating surface HRP long scans for CPT 104-01 (X: μm, Y: Å)
Figure 3-14. Post-plating surface HRP long scans for CPT 115-04 (X: μm, Y: Å)
Figure 3-15. Post-plating surface HRP long scans for CPT 105-01 (X: μm, Y: Å)
Figure 3-16. Post-plating surface HRP long scans for CPT 115-07 (X: μm, Y: Å)
Figure 3-17. Surface topography profiles for fine pitch structures, in the electroplating process analyzed by Park
Figure 3-18. Center (left) and edge (right) HRP scans for CPT 104-01 (X: μm, Y: Å)
Figure 3-19. Copper thickness above 0.18 and 0.25 μm line arrays for multiple dies and multiple wafers (courtesy Philips Analytical)
Figure 3-20. Topography variables
Figure 3-21. Envelope and step height illustration from HRP scan
Figure 3-22. Layout variables extraction
Figure 3-23. Different weighting functions for calculation of effective pattern density
Figure 3-24. A schematic of the approximate geometry for the simple model
Figure 3-25. Attenuation of bump formation during copper deposition
Figure 3-26. Local copper growth rate in and over feature
Figure 3-27. Stack information for MIT/SEMATECH 854 M1 wafers
Figure 3-28. Profile evolution before trench overfill at feature level
Figure 3-29. Profile evolution after trench overfill at feature level
Figure 3-30. Simulation results for two-additive deposition of copper
Figure 3-31. Inter-feature cupric ion depletion effect at 10 sec
Figure 3-32. Inter-feature cupric ion depletion effect at 60 sec
Figure 3-33. Inter-feature cupric ion depletion effect at 140 sec
Figure 3-34. Array height data and simulation results for CPT104-01
Figure 3-35. Array height data and simulation results for CPT115-04
Figure 3-36. Array height data and simulation results for CPT105-01
Figure 3-37. Array height data and simulation results for CPT115-07
Figure 3-38. Effective step height data and simulation results for CPT104-01
Figure 3-39. Effective step height data and simulation results for CPT115-04
Figure 3-40. Effective step height data and simulation results for CPT105-01
Figure 3-41. Effective step height data and simulation results for CPT115-07
Figure 3-42. Copper electroplating profile for the wide trench and narrow line space
Figure 3-43. Field copper thickness data and simulation results for CPT104-01
Figure 3-44. Field copper thickness data and simulation results for CPT115-04
Figure 3-45. Field copper thickness data and simulation results for CPT105-01
Figure 3-46. Field copper thickness data and simulation results for CPT115-07
Figure 3-47. Nominal copper deposition rate vs. electroplating time for fine features
Figure 3-48. Average copper deposition rate vs. electroplating time for fine features
Figure 3-49. Array topography simulation for CPT 115-01 (X: μm, Y: Å)
Figure 3-50. Array topography simulation for CPT 104-01 (X: μm, Y: Å)
Figure 3-51. Topography simulation of several arrays for CPT 104-01 (X: μm, Y: Å)
Figure 3-52. Topography simulation of various structures for CPT 104-01 (X: μm, Y: Å)
Figure 3-53. Array topography simulation for CPT 115-04 (X: μm, Y: Å)
Figure 3-54. Array topography simulation for CPT 105-01 (X: μm, Y: Å)
Figure 3-55. Array topography simulation for CPT 115-07 (X: µm, Y: Å)
Figure 3-56. MIT/SEMATECH 854 M1 mask
Figure 3-57. Chip-scale ECD modeling at t=10 sec
Figure 3-58. Chip-scale ECD modeling at t=20 sec
Figure 3-59. Chip-scale ECD modeling at t=30 sec
Figure 3-60. Chip-scale ECD modeling at t=40 sec
Figure 3-61. Chip-scale ECD modeling at t=60 sec
Figure 3-62. Chip-scale ECD modeling at t=100 sec
Figure 3-63. Chip-scale ECD modeling at t=200 sec
Figure 3-64. Chip-scale ECD modeling at t=350 sec
Figure 3-65. MagnaChip internal verification mask
Figure 3-66. Array height data and simulation results for MagnaChip verification mask
Figure 3-67. Chip-scale ECD modeling for MagnaChip verification mask
Figure 3-68. MIT/SEMATECH 854 via mask
Figure 3-69. MIT/SEMATECH 854 M2 mask
Figure 3-70. Stack information for MIT/SEMATECH 854 M2 wafers
Figure 3-71. HRP array scans and long scans for post-CMP M1 and M2 wafers (M1: blue, M2: magenta)
Figure 3-72. M2 structures and HRP data extraction
Figure 3-73. Locations of S2-sites for M2 copper thickness measurements (M1: blue, M2: magenta)
Figure 3-74. Array topography simulation for M2 pre-electroplating topography
Figure 3-75. Uneven underlying topography
Figure 3-76. Array topography simulation for M2 post-electroplating topography
Figure 3-77. Chip-scale ECD modeling for MIT/SEMATECH 854 M2
Figure 4-1. Shape evolution (top) and pressure evolution (bottom) in polishing a line array made of the same materials
Figure 4-2. A schematic of the contact model
Figure 4-3. Density-step-height model for single (top) and dual material polishing (bottom)
Figure 4-4. Simulated pressure dependence on step height
Figure 4-5. Contact height approximation by linearly fitting to contact wear calculations, giving \((line\ width)/(1-effective\ density)^{1/3}\)
Figure 4-6. Contact wear model in Tugbawa’s model framework
Figure 4-7. Contact wear model in the new model framework
Figure 4-8. Pad and asperity deformation at the feature scale
Figure 4-9. Three-part strategy in the new model framework
Figure 4-10. Modeling process flow chart
Figure 4-11. Definition of envelope and step height in the new model framework
Figure 4-12. Rohm and Haas IC-1000 and Politex pads
Figure 4-13. IC-1000 pad hierarchical roughness
Figure 4-14. Relative displacement of the asperity and the applied nominal pressure distributions
Figure 4-15. IC 1000 pad surface representative line scans and pad height probability distributions
Figure 4-16. Examples of the peak fitting procedure
Figure 4-17. Hypothetical asperity height probability distribution for a worn and new pad surface
Figure 4-18. Contact area of IC 1000 pad under pressure
Figure 4-19. Contact image of IC 1000 pad under pressure
Figure 4-20. Hypothetical pad asperity contact size distribution
Figure 4-21. Rough pad surface model for polishing
Figure 4-22. Contact between asperities with relatively small and large trenches
Figure 4-23. Asperity bending over the whole pitch
Figure 4-24. Asperity bending over line space
Figure 4-25. Average cell pressure as a function of step height
Figure 4-26. Cu removal rate vs. down force for different Cu abrasive-free slurry solutions
Figure 4-27. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec (X: µm, Y: Å). Blue: HRP scan data, Red: grid top, Yellow: grid bottom
Figure 4-28. Array topography simulation for M1 step 1 at 40 sec
Figure 4-29. Array topography simulation for M1 step 1 at 50 sec
Figure 4-30. Array topography simulation for M1 step 1 at 100 sec
Figure 4-31. Array topography simulation for M1 step 1 at 130 sec
Figure 4-32. Array topography simulation for M1 step 1 at 160 sec
Figure 4-33. Simulation of four array scans for M1 step 1 at 100 sec
Figure 4-34. Chip-scale CMP modeling for M1 step 1 at t=0 sec
Figure 4-35. Chip-scale CMP modeling for M1 step 1 at t=10 sec
Figure 4-36. Chip-scale CMP modeling for M1 step 1 at t=20 sec
Figure 4-37. Chip-scale CMP modeling for M1 step 1 at t=30 sec
Figure 4-38. Chip-scale CMP modeling for M1 step 1 at t=40 sec
Figure 4-39. Chip-scale CMP modeling for M1 step 1 at t=50 sec
Figure 4-40. Chip-scale CMP modeling for M1 step 1 at t=60 sec
Figure 4-41. Chip-scale CMP modeling for M1 step 1 at t=70 sec
Figure 4-42. Chip-scale CMP modeling for M1 step 1 at t=80 sec
Figure 4-43. Chip-scale CMP modeling for M1 step 1 at t=90 sec
Figure 4-44. Chip-scale CMP modeling for M1 step 1 at t=100 sec
Figure 4-45. Chip-scale CMP modeling for M1 step 1 at t=130 sec
Figure 4-46. Chip-scale CMP modeling for M1 step 1 at t=160 sec
Figure 4-47. Temperature impacts on C430 removal rate and friction force
Figure 4-48. Endpoint detection motor current of a patterned STI wafer
Figure 4-49. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec
Figure 4-50. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 40 sec
Figure 4-51. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 50 sec
Figure 4-52. Chip-scale CMP modeling for M1 step 1 at t=10 sec
Figure 4-53. Chip-scale CMP modeling for M1 step 1 at t=20 sec
Figure 4-54. Chip-scale CMP modeling for M1 step 1 at t=30 sec
Figure 4-55. Chip-scale CMP modeling for M1 step 1 at t=40 sec
Figure 4-56. Chip-scale CMP modeling for M1 step 1 at t=50 sec
Figure 4-57. Sensitivity analysis for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec (X: μm, Y: Å)
Figure 4-58. Envelope data and simulation results for MagnaChip mask step 1
Figure 4-59. Chip-scale CMP modeling for verification mask step 1 at t=0 sec
Figure 4-60. Chip-scale CMP modeling for verification mask step 1 at t=10 sec
Figure 4-61. Chip-scale CMP modeling for verification mask step 1 at t=20 sec
Figure 4-62. Chip-scale CMP modeling for verification mask step 1 at t=30 sec
Figure 4-63. Chip-scale CMP modeling for verification mask step 1 at t=50 sec
Figure 4-64. Chip-scale CMP modeling for verification mask step 1 at t=70 sec
Figure 4-65. Chip-scale CMP modeling for verification mask step 1 at t=80 sec
Figure 4-66. Chip-scale CMP modeling for verification mask step 1 at t=90 sec
Figure 4-67. Chip-scale CMP modeling for verification mask step 1 at t=100 sec
Figure 4-68. HRP array scan data for MIT/SEMATECH 854 M1 step 2 polishing at 30 sec
Figure 4-69. HRP array scan data for MIT/SEMATECH 854 M1 step 2 polishing at 50 sec
Figure 4-70. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 30 sec
Figure 4-71. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 50 sec
Figure 4-72. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 70 sec
Figure 4-73. Chip-scale CMP modeling for M1 step 2 at t=0 sec
Figure 4-74. Chip-scale CMP modeling for M1 step 2 at t=30 sec
Figure 4-75. Chip-scale CMP modeling for M1 step 2 at t=50 sec
Figure 4-76. Chip-scale CMP modeling for M1 step 2 at t=70 sec
Figure 4-77. Envelope data and simulation results for MagnaChip mask step 2
Figure 4-78. Chip-scale CMP modeling for verification mask step 2 at t=0 sec
Figure 4-79. Chip-scale CMP modeling for verification mask step 2 at t=30 sec
Figure 4-80. Chip-scale CMP modeling for verification mask step 2 at t=50 sec
Figure 4-81. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 100 sec
Figure 4-82. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 130 sec
Figure 4-83. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 160 sec
Figure 4-84. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 1 at 0 sec
Figure 4-85. Underlying topography for MIT/SEMATECH 854 M2
Figure 4-86. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 1 at 100 sec
Figure 4-87. Remaining copper thickness after copper polishing for M2
Figure 4-88. Array topography simulation for MIT/SEMATECH 854 M2 step 2 polishing at 30 sec
Figure 4-89. Array topography simulation for MIT/SEMATECH 854 M2 step 2 polishing at 60 sec
Figure 4-90. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 2 at 60 sec
Figure 4-91. Remaining copper and barrier thickness after barrier polishing for M2
Figure 5-1. Schematic showing traditional between-pattern and in-pattern dummy fills. (Left) Cross sectional view. (Right) Top down view
Figure 5-2. Strategy of reduced deposited copper thickness and in-pattern dummy fills
Figure 5-3. In-pattern dummy fill design #1
Figure 5-4. In-pattern dummy fill design #2
Figure 5-5. In-pattern dummy fill design #3
Figure 5-6. In-pattern dummy fill design #4
Figure 5-7. MIT/SEMATECH 854 M1 mask without dummy fills
Figure 5-8. MIT/SEMATECH 854 M1 mask with dummy fills #3-A
Figure 5-9. MIT/SEMATECH 854 M1 mask with dummy fills #3-B
Figure 5-10. MIT/SEMATECH 854 M1 mask with dummy fills #3-C
Figure 5-11. Pattern density histogram plots
Figure 5-12. Post-electroplating topography maps without dummy fills
Figure 5-13. Post-electroplating topography maps with dummy fills #3-A
Figure 5-14. Post-electroplating topography maps with dummy fills #3-B
Figure 5-15. Post-electroplating topography maps with dummy fills #3-C
Figure 5-16. Histogram plots for average electroplated copper surface thickness
Figure 5-17. Removal rate vs. down force for the two hypothetical slurries
Figure 5-18. Post-CMP topography maps without dummy fills, using hypothetical abrasive-free slurry
Figure 5-19. Post-CMP topography maps with dummy fills #3-A, using hypothetical abrasive-free slurry
Figure 5-20. Post-CMP topography maps with dummy fills #3-B, using hypothetical abrasive-free slurry
Figure 5-21. Post-CMP topography maps with dummy fills #3-C, using hypothetical abrasive-free slurry
Figure 5-22. Copper thickness maps of dummy filled areas without dummy fills, using hypothetical abrasive-free slurry
Figure 5-23. Copper thickness maps of dummy filled areas with dummy fills #3-A, using hypothetical abrasive-free slurry
Figure 5-24. Copper thickness maps of dummy filled areas with dummy fills #3-B, using hypothetical abrasive-free slurry
Figure 5-25. Copper thickness maps of dummy filled areas with dummy fills #3-C, using hypothetical abrasive-free slurry
Figure 5-26. Histogram plots for copper thickness at dummy filled areas, using hypothetical abrasive-free slurry................................................................. 281
Figure 5-27. Histogram plots for effective copper thickness at dummy filled areas, using hypothetical abrasive-free slurry................................................................. 282
Figure 5-28. Copper thickness maps of dummy filled areas without dummy fills, using hypothetical conventional slurry....................................................... 283
Figure 5-29. Copper thickness maps of dummy filled areas with dummy fills #3-A, using hypothetical conventional slurry. ................................................................. 283
Figure 5-30. Copper thickness maps of dummy filled areas with dummy fills #3-B, using hypothetical conventional slurry. .................................................................... 284
Figure 5-31. Copper thickness maps of dummy filled areas with dummy fills #3-C, using hypothetical conventional slurry. .................................................................... 284
Figure 5-32. Histogram plots for effective copper thickness at dummy filled areas, using hypothetical conventional slurry. ................................................................. 285
List of Tables

Table 1-1. Generic process parameters for 90 nm process.
Table 1-2. MPU interconnect technology requirements—near-term years.
Table 2-1. Design of experiment for time-step ECD model.
Table 2-2. Design of experiment for M1 ECD and CMP.
Table 2-3. Design of experiment for M2 ECD and CMP.
Table 3-1. ECD chip-scale modeling error for MIT SEMATECH 854 M
Table 4-1. CMP chip-scale modeling error for MIT SEMATECH 854 M1.
Table 4-2. CMP chip-scale modeling error for MIT SEMATECH 854 M1 for the initial 50 sec polishing.
Table 5-1. Impacts of size specifications of in-pattern dummy fills.

Chapter 1

Introduction and Motivation for Research

This chapter presents the introduction and motivation for modeling of pattern dependencies in the fabrication of multilevel copper metallization. It is well known that multilevel topography, or surface height variation, resulting from pattern dependencies in various processes, especially copper Electrochemical Deposition (ECD) and Chemical-Mechanical Planarization (CMP), is a major problem in interconnects. An integrated pattern-dependence chip-scale model of multilevel copper metallization is needed to meet stringent industry specifications, and to optimize multilevel copper metallization processes to achieve environmentally benign fabrication, higher yield and performance, and to enable the optimization of layout and dummy filling designs. In Section 1.1, we first present an overview and background for multilevel copper metallization. Section 1.2 reviews the basic mechanisms and models for ECD, and Section 1.3 summarizes the CMP process. In Section 1.4, we discuss the motivation for this research project, including previous related work and a brief summary of the contributions of this thesis research. Finally, Section 1.5 presents the organization of the rest of the thesis.

1.1 Overview of Multilevel Copper Metallization

The standard aluminum-copper alloy has been the choice for interconnects in integrated circuits for over three decades. However, with severe dimension shrinkage and transistor performance improvements in integrated circuits, interconnect delay, reliability and manufacturability are becoming important bottlenecks in Ultra-Large-Scale-Integrated (ULSI) circuit performance and fabrication, especially at gate lengths of 0.13 μm and below [1]. The migration to new material alternatives for interconnects has become inevitable in order to satisfy the stringent requirements set by the semiconductor industry and the International Technology Roadmap for Semiconductors [2]. The substitution of copper for the standard aluminum-copper alloy for interconnects is an exciting step in this transition. After over 10 years of research and development, the copper era was realized in Fall 1997, when IBM and Motorola each announced their revolutionary transition to copper interconnect technology at the 1997 IEEE International Electron Devices Meeting [3, 4]. Figure 1-1 shows an example of copper metallization in a six-level structure associated with a production 32-bit RISC Processor in the IBM CMOS 7S technology [5].

Figure 1-1. Copper metallization characteristic of a six-level structure [5].

Compared to the aluminum-copper alloy, copper has a much lower electrical resistivity and a much higher electromigration resistance corresponding to the relatively higher melting point. Thus copper wiring has advantages in RC delay, power dissipation, current density, reliability, and scalability over aluminum wiring, consistent with high-performance and high-density needs. In addition, the inter-level and intra-metal dielectrics in interconnects face similar scaling requirements, especially decreasing C (capacitance) to offset the increasing RC delay of global wiring as well as local and intermediate wiring with ever-increasing performance and ever-increasing density in integrated circuits [6]. The integration of low-resistivity copper and low-permittivity (K) dielectrics can provide further performance and reliability enhancement to meet the requirements of the next technology generations. Figure 1-2 compares circuit delay as a function of the feature size between Al/SiO2 and Cu/Low-K technologies.

Figure 1-2. Circuit delay as a function of the feature size (low K=2) [6, 7].

In order to realize high density and high performance integrated circuits, multilevel metal is required. Designers have generally adopted a hierarchical approach. In most cases, each succeeding metal level (or group of metal levels) increases in pitch and thickness to alleviate the impact of RC interconnect delay on performance, as shown in Figure 1-3. Local, intermediate and global wiring pitch and aspect ratios are differentiated to meet the different effects of scaling on each wiring level [2].

![Diagram of hierarchical scaling](image)

Figure 1-3. Cross-section of hierarchical scaling [2].

Multilevel copper metallization is critical in advanced interconnect technology. However, there are several fabrication challenges in achieving high yield and economical copper wiring in key process steps including copper deposition, patterning and planarization. Traditional aluminum interconnect fabrication technology cannot be transplanted to copper directly. In order to overcome fabrication challenges, several new technologies have been developed and introduced. A dual damascene approach using chemical-mechanical polishing or planarization (CMP) with electrochemical deposition (ECD) is the predominant fabrication technique. CMP provides local and global planarization capability to substitute for dry etching of copper, which suffers from limited volatile copper compounds and other etching difficulties. ECD is very efficient in filling the damascene structure without voids and seams. This superconformal filling, or superfilling, ability is difficult to achieve by other copper deposition methods, such as physical vapor deposition (PVD). Another advantage of dual damascene technology for copper interconnects is the lower cost compared with single damascene and subtractive pattern/etch approaches, which are used in aluminum/tungsten interconnect fabrication [8]. More details on damascene, ECD and CMP are presented in the following sections.

1.1.1 Single Damascene

Damascene is an ancient jewelry making process used to produce and polish inlaid metal [8], and now is the term used to describe the advanced technology for making inlaid metal lines and interconnects in integrated circuits instead of jewelry. The need for high performance and reliability prompted the conversion from the aluminum subtractive etch process to copper damascene.

The single damascene process is shown in Figure 1-4, where copper vias and copper lines are formed with separate pattern, fill, and CMP process step sequences. After an interlevel dielectric (ILD) is patterned and etched, a thin layer of barrier material and a copper seed layer are deposited, metal is deposited in the via holes defined within the ILD, and then excess blanket metal is removed by CMP. This single damascene copper via process is very similar to the process used for tungsten vias in conventional aluminum/tungsten interconnect, which are also formed using a CMP damascene approach. In the second stage, additional etch stop and ILD layers are deposited. A similar process sequence to that used for via formation is then used to form trenches above the vias. The key point for single damascene is that the vias and trenches are formed in different steps, and thus two copper ECD and two copper CMP steps are required for each level of interconnect.

![Single damascene process flow](image)

Figure 1-4. Single damascene process flow [8].

1.1.2 Dual Damascene

In contrast to the single damascene process, the dual damascene process forms both the via and trench openings first, and then deposits the barrier layer and copper seed layer and fills the vias and trenches at the same time, as shown in Figure 1-5. In this way, only one copper ECD step and one copper CMP step are required for every interconnect level in metallization. Considering the high cost of copper CMP, the reduction in fabrication cost compared to single damascene is significant. There are other benefits from dual damascene beyond cost, such as lower via resistance (since the barrier between the trench wire and via can be omitted) and higher reliability and process yield [8]. The major problems for dual damascene lie in the higher aspect ratios of the vias and trenches in the etch and copper deposition steps. That is one reason to use copper ECD to fill the vias and trenches instead of other filling processes. In the next section, more details about ECD will be presented.

![Dual damascene process flow](image)

Figure 1-5. Dual damascene process flow [8].

There are several different dual damascene schemes demonstrated in the literature. These can be classified as “via first” or “trench first” depending on which is patterned first. In order to integrate low K materials with the CMP process, new dual damascene schemes have also been demonstrated, such as the “top dual hard mask” approach [8].

1.2 Copper ECD

Electroplating is the most suitable copper deposition method so far for the fabrication of damascene interconnects. It can inlay copper simultaneously in vias and trenches. The advantage of ECD compared to other potential fabrication processes for damascene copper interconnects is its extraordinary ability to fill trenches and vias completely, economically and efficiently [9]. Figure 1-6 shows the basic principle of a copper ECD system. The wafer, coated with a copper seed layer, and a copper plate are connected to a DC power source and immersed in a copper ion solution containing various additives, so that they act as cathode and anode, respectively. When current flows, copper ions are deposited on the cathode, while the depleted copper ions in the solution are replenished from the copper anode. The reduction reaction occurring at the cathode is \( \text{Cu}^{2+} + 2e^- \rightarrow \text{Cu} \). The reverse oxidation reaction occurs at the copper anode.

Figure 1-6. Principle of copper ECD.
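
As a rough illustration of the cathode reaction, Faraday's law relates the plating current density to the copper film growth rate. The sketch below is not taken from the thesis: the function name, the 100% current efficiency default, and the bulk copper density are assumptions made for illustration only.

```python
# Faraday's-law estimate of copper growth rate from current density.
# Assumptions (not from the thesis): bulk copper density, two electrons
# per deposited atom (Cu2+ + 2e- -> Cu), and a user-chosen efficiency.

FARADAY = 96485.332      # C/mol
CU_MOLAR_MASS = 63.546   # g/mol
CU_DENSITY = 8.96        # g/cm^3, bulk copper
N_ELECTRONS = 2          # electrons transferred per copper ion


def plating_rate_nm_per_s(current_density_ma_cm2, efficiency=1.0):
    """Return the film growth rate in nm/s for a current density in mA/cm^2."""
    j = current_density_ma_cm2 * 1e-3  # convert to A/cm^2
    rate_cm_per_s = efficiency * j * CU_MOLAR_MASS / (
        N_ELECTRONS * FARADAY * CU_DENSITY
    )
    return rate_cm_per_s * 1e7  # convert cm/s to nm/s


if __name__ == "__main__":
    rate = plating_rate_nm_per_s(10.0)
    print(f"{rate:.2f} nm/s, {rate * 60:.0f} nm/min")
```

This gives only a blanket-film average; the feature-scale and pattern-dependent effects discussed in this chapter arise from local variations in additive coverage and ion concentration that such an estimate ignores.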

A successful copper electroplating process with superfilling behavior leads to void-free, seam-free damascene structures. Sub-conformal or conformal electroplating deposition must be avoided in order to achieve high reliability. The void and seam formation in these processes is shown in Figure 1-7 [5]. Super-conformal deposition is also referred to as superfilling.

Figure 1-7. Evolution of hole filling with different deposition conditions [5].

1.2.1 ECD Mechanism and Model

The mechanism for copper electroplating is shown in Figure 1-8 [10]. Cu$^{2+}$ ions first diffuse to the surface, and then react on the surface by a two-step electron transfer process. In the first step, by capturing one electron, Cu$^{2+}$ becomes Cu$^+$ and is adsorbed onto the surface. In the second step, these cuprous adions diffuse laterally to a second location on the surface and incorporate into the crystal lattice after capturing another electron.

Figure 1-8. Schematic representation of the hypothetical copper system [10].

A stable electrolyte solution in copper electroplating contains copper sulfate and sulfuric acid as well as organic additives. Rein [11] classified plating organic additives into three categories. Accelerators are mercapto-containing species that enhance the local current at a given voltage at the sites where they adsorb on the copper surface. Generally, the accelerator is strongly adsorbed on the surface and displaces other less strongly adsorbed additives. Suppressors are polymers such as polyethylene glycol that form a current-suppressing film on the wafer surface, especially in the presence of chloride ions, which act as co-suppressors. Suppressors are characterized by a high concentration in the bath and no consumption during plating (that is, no incorporation into the copper film). Levelers are another class of current-suppressing molecules which are generally added at a low concentration in the bath. They can be consumed or incorporated in the deposited film. The differences among these organic additives are very important in modeling the evolution of deposited copper profiles.

In order to explain the superfilling behavior in electroplating with levelers in the bath, various similar models have been proposed based on a diffusion-adsorption mechanism. Madore originally used such a mechanism to explain superfilling for the case of nickel deposition in the presence of levelers. Andricacos [9] assumes that the additive is consumed at a rate controlled by mass transfer to the electrode surface, and that this limits the whole electroplating process. Adsorbed additives can strongly decrease the current and thus the electrodeposition rate. When the additives are consumed, diffusion can be a limiting step. The wide variation of additive flux with local position makes the deposition rate strongly position dependent: higher at the bottom of the trenches, lower on the sidewalls, and lowest at the shoulders. West has applied a similar model to simulate copper deposition into high aspect ratio trenches [12]. The model for a single component system can be extended to multi-component systems. Cao [13] examines the effect of an accelerator bis(3-sulfopropyl)disulfide (SPS), a suppressor poly(ethylene glycol) (Cl-PEG), and a leveler Janus Green B (JGB) on the superfilling efficiency in submicrometer trenches. In the three-additive model, a multicomponent version of a Frumkin isotherm model and a competitive adsorption Langmuir model are used to describe the interaction between SPS and PEG and the interaction between PEG and JGB, respectively. In the multi-component system, the diffusion-adsorption mechanism is no longer applicable if the leveler is absent. A new model is required to describe the superfilling recipe with bump formation as shown in Figure 1-9, which cannot be explained by the diffusion-adsorption mechanism.
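
To make the idea of competitive adsorption concrete, the following minimal sketch evaluates the generic textbook competitive Langmuir isotherm, in which each species i covers a surface fraction K_i c_i / (1 + sum_j K_j c_j). It is not the formulation used by Cao [13]; the species labels and the equilibrium constants and concentrations in the example are hypothetical values chosen only to show the behavior.

```python
# Generic competitive Langmuir isotherm (textbook form), used here only to
# illustrate how a strongly adsorbing species crowds out a weaker one.
# All numerical values below are hypothetical, not fitted to any bath.


def competitive_langmuir(concentrations, equilibrium_constants):
    """Return the equilibrium surface coverage of each adsorbing species.

    concentrations: dict mapping species name -> bulk concentration
    equilibrium_constants: dict mapping species name -> adsorption constant
    (units chosen so that K * c is dimensionless)
    """
    loading = {
        name: equilibrium_constants[name] * conc
        for name, conc in concentrations.items()
    }
    denominator = 1.0 + sum(loading.values())
    return {name: value / denominator for name, value in loading.items()}


if __name__ == "__main__":
    coverage = competitive_langmuir(
        {"suppressor": 1.0, "leveler": 0.1},    # hypothetical concentrations
        {"suppressor": 1.0, "leveler": 30.0},   # hypothetical constants
    )
    for name, theta in coverage.items():
        print(f"{name}: {theta:.2f}")
```

In this example the dilute but strongly binding species ends up with the larger coverage, which is the qualitative behavior the additive models above rely on.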

Figure 1-9. Results of copper fill experiments [15]. X: feature size, Y: Nominal field copper thickness.

Reid [16] visualizes additive behavior during copper plating fill of a narrow and deep feature and explains the well-known bottom-up fill phenomenon, as shown in Figure 1-10. He proposes that there is a primary mechanism due to several factors, as well as a secondary mechanism involving levelers, for the establishment and propagation of bottom-up fill. The primary mechanism is focused on the behavior of the accelerator and suppressor. After the wafer is immersed in a plating solution without current flow, additives are adsorbed on the surface of the copper seed layer and reach an equilibrium concentration. Conformal deposition in the feature occurs in the initial stage due to the equilibrated additive concentration. Accelerators accumulate near the base of the trench, and displace less strongly adsorbed additives, such as suppressors. Further, rapid growth near the base causes more accelerator accumulation (build up) due to a decrease of surface area inside the filling feature. Accelerators cannot be incorporated into the deposited copper layer. When the fill of the trench is just complete, the copper over narrow and deep features has an adsorbed excess of accelerators. Thus superfilling can take place without the desorption of accelerators or levelers. The secondary fill mechanism relates to the control of mass transfer of levelers. The levelers can enhance the suppression of current flow at surfaces if there is no limitation in the mass transfer of the leveler. In low mass-transfer areas, such as narrow or deep features, however, the leveler concentration is mass-transfer limited, which results in a higher deposition rate. Generally, the primary mechanism dominates the secondary mechanism if accelerators are present in the plating bath.

Figure 1-10. Additive behavior during copper ECD fill of narrow and deep feature [15].

Other researchers have further quantified the primary mechanism for bottom-up filling or superfilling. West has proposed an accelerator-accumulation model to explain the two-component system with bump formation [14]. The decrease in the surface area available for additive adsorption as the plating progresses leads to a decrease in the surface concentration of suppressors; this assumes that superfilling is caused by an accumulation of accelerators on the surface. In other words, deposition acceleration, or superfilling, at the bottom of a feature is due to a change in surface area as deposition proceeds. The decrease in surface area results in an increase in the surface concentration of accelerators on the electrode surface, which excludes suppressors and lowers their surface fraction. The lower the surface fraction of suppressors, the higher the local deposition rate.

A similar model, referred to as the curvature enhanced accelerator coverage (CEAC) mechanism, is employed by Moffat to explain bump formation in a specific superfilling recipe [16, 17, 18, 19]. Moffat et al. focus primarily on the role of the accelerator in their model, whereas West emphasizes that the slow surface dynamics are associated with suppressors. A dilute accelerator (thiol or disulfide derived from a 3-mercapto-1-propanesulfonate additive, MPSA) adsorbs strongly on the copper surface, and thereby displaces the more weakly bound suppressor (Cl-PEG). All adsorbed additives are assumed to remain on, or float at, the surface during deposition. The accumulation of adsorbed accelerators results from the reduction of surface area related to local surface curvature during growth. At the points of high positive curvature, particularly the bottoms of small vias, increased local deposition velocity is observed.
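
The area-reduction argument shared by the accelerator-accumulation and CEAC descriptions can be sketched with a simple conservation statement: if adsorbed accelerator is neither consumed nor desorbed, its coverage scales inversely with the remaining surface area. The code below is an illustrative simplification, not the model of West or Moffat; in particular, the linear interpolation between a suppressed and an accelerated deposition rate, and all numerical values, are assumptions.

```python
# Illustrative sketch of accelerator accumulation on a shrinking surface.
# Assumptions (not from the cited models): adsorbed accelerator is conserved,
# coverage saturates at 1, and the local rate varies linearly with coverage.


def accelerator_coverage(initial_coverage, initial_area, current_area):
    """Coverage after the surface shrinks from initial_area to current_area."""
    return min(1.0, initial_coverage * initial_area / current_area)


def local_deposition_rate(coverage, suppressed_rate, accelerated_rate):
    """Linear blend between fully suppressed and fully accelerated rates."""
    return suppressed_rate + (accelerated_rate - suppressed_rate) * coverage


if __name__ == "__main__":
    # Hypothetical trench-bottom segment whose area shrinks during filling.
    for area in (1.0, 0.5, 0.25, 0.1):
        theta = accelerator_coverage(0.1, 1.0, area)
        rate = local_deposition_rate(theta, 1.0, 10.0)
        print(f"area={area:.2f}  coverage={theta:.2f}  rate={rate:.1f}")
```

The rate rises as the area shrinks, which is the qualitative origin of bottom-up fill and of the bump that forms once the feature is full.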

Based on the mechanics of the CEAC model, several methods have been applied to simulate the superfilling deposition process in a single trench. A deposition front-tracking simulation that neglects the position dependence of the cupric ion and additive concentrations near the trench surface takes about 30 minutes of computation time, and a level-set approach that includes the diffusion equations for the cupric ion and additives takes several hours. In comparison, a simple geometrical model that captures the fundamentals of the near-optimized filling mechanics using a first-order differential equation reduces the simulation time to less than one second, as shown in Figure 1-11. The simulation results are in good agreement with the results from more complex codes and with experimental data over a broad range of parameter space, with few problems in predictive ability [20]. Another advantage is that quantities related to step height and array height, which are needed for topography characterization, are easy to extract from the simplified plating simulation model.

Figure 1-11. Filling contours predicted by the simple model (left) and the level-set code (right) [20].

1.2.2 Copper ECD Challenges and New Technologies

There are numerous technological challenges for copper ECD with aggressive and on-going shrinkage of feature size in semiconductor manufacturing. Several key challenges for copper ECD will be briefly discussed in the following sections, together with possible technological solutions. The discussion is ordered in wafer, chip and feature scales.

Several concerns center on the process integration of copper ECD and CMP, most of them at the wafer scale. AMD proposed aligning the wafer-scale removal profile of CMP with the wafer-scale copper plating profile [21]. In particular, the goal is to compensate for wafer-scale CMP nonuniformity (e.g., edge-fast polish) by a complementary nonuniformity in the plating profile (e.g., edge-thick deposition). The post-plating profile should preferably be customized to match a specific copper CMP process and its removal profile on the wafer scale, especially near the wafer edge. This approach can provide better under-polishing and over-polishing control, improve process and electrical performance, and thus increase throughput. It requires substantial flexibility in profile control of the copper ECD process, especially at the wafer edge. Novellus’ SABRE Extreme ECD system, which targets 45 nm and below [22], is an example of a tool designed to meet this need. This tool enables recipe-driven edge profile control for compatibility with the CMP removal profile at the wafer edge.
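
The profile-matching idea can be illustrated with a small calculation: if the CMP removal rate at each radial position is known, a plated thickness proportional to that rate clears everywhere at the same time. This is a simplified sketch, not the method of [21]; it assumes a time-invariant blanket removal rate at each position, and the function name and numbers are hypothetical.

```python
# Sketch of a plating profile chosen to complement a CMP removal profile.
# Assumption (not from the thesis): the removal rate at each radial position
# is constant in time, so clear time = plated thickness / removal rate.


def matched_plating_profile(removal_rates, minimum_thickness):
    """Return plated thicknesses that all clear at the same polish time.

    removal_rates: CMP removal rate at each radial position (e.g. nm/s)
    minimum_thickness: plated thickness required at the slowest position (nm)
    """
    clear_time = minimum_thickness / min(removal_rates)
    return [rate * clear_time for rate in removal_rates]


if __name__ == "__main__":
    # Hypothetical edge-fast polish: center, mid-radius, edge.
    rates = [5.0, 5.0, 6.0]
    print(matched_plating_profile(rates, 1000.0))
```

With an edge-fast polish the matched profile is edge-thick, which is the complementary nonuniformity described above.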

At the chip scale, pattern dependency is a key issue in plating. The nonuniform surface height after plating creates a number of problems for the subsequent CMP step, and increases dishing and erosion, which will be discussed in the next section. As shown in Figure 1-12, various line widths and line spaces in the layout design produce different and uneven topography during the plating process, which is not desirable for CMP. The combination of features, and not simply individual feature characteristics (line width and line space), contributes to the post-plating topography. Thus, pattern dependency in ECD must be looked at as a chip-scale problem rather than just a feature-scale issue. While line width and line space are the key factors impacting the electroplated topography, regional effects due to ion depletion also impact the plated thickness. The surface topography is characterized by two quantities, step height and array height, as shown in Figure 1-12. Step height is defined as the bottom depth of each line with respect to the nearby surface, and the array height is defined as the top surface height of copper over an array region with respect to the flat copper field region over a wide oxide area [23].
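
As a small worked example of the two definitions above, the sketch below extracts a step height and an array height from a one-dimensional surface height profile over an array region. It is an illustrative simplification: taking the maximum and minimum of the sampled profile as the "top surface" and the "bottom of each line", the sign convention, and the sample numbers are all assumptions.

```python
# Illustrative extraction of step height and array height from a sampled
# copper surface profile. Assumptions (not from the thesis): the highest
# sample represents the array top surface, the lowest sample represents the
# bottom over a line, and positive array height means the array stands above
# the field (a bulge) while negative means it is recessed.


def step_and_array_height(array_profile, field_height):
    """Return (step_height, array_height) in the units of the inputs."""
    top = max(array_profile)
    bottom = min(array_profile)
    step_height = top - bottom          # depth over a line vs. nearby surface
    array_height = top - field_height   # array top vs. flat field copper
    return step_height, array_height


if __name__ == "__main__":
    # Hypothetical heights in nm, alternating over spaces and lines.
    profile = [1500.0, 1300.0, 1500.0, 1300.0]
    print(step_and_array_height(profile, field_height=1550.0))
```

For this hypothetical profile the step height is 200 nm and the array is recessed 50 nm below the field.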

Several approaches can be used to address or minimize pattern dependency issues in ECD. Based on the previously proposed copper ECD mechanism, adjusting additive type and quantity can affect the final topography. Additive adsorption and desorption are another controlling factor. Beyond the ECD recipe, layout design also can be tuned to optimize the final topography. One key method is to use dummy fill: non-electrically active patterned features added solely for the purpose of improving process uniformity. For example, ACM proposed using dummy fill in the wide trenches to achieve a virtually flat post-plating surface, with the goal of enabling the use of its electro-polishing process instead of CMP to remove the copper overburden [24]. At the process integration level, co-optimization of ECD and CMP is another potential solution to reduce, if not eliminate, pattern induced topography variation. The co-optimization of ECD and CMP and the dummy fill strategy are key issues which will be addressed in Chapter 5.

Figure 1-12. Copper ECD pattern dependency [23].

At the feature-scale level, the major problem for copper ECD is high aspect-ratio fill capability. With feature size scaling, the aspect ratio (depth of feature to width of feature) will be up to 10:1 in the near future. Improving the plating chemistry and ECD tool design will be the primary solutions to achieve seamless fill of smaller geometry and higher aspect ratio structures [2]. Process modeling and simulation are helpful in identifying a process window which can achieve void-free fill. At the same time, other issues, including electromigration and electrical resistivity, require novel processes and new materials for barrier and seed layer deposition.

Minimization of the post-plating topography and excess copper thickness can substantially improve the subsequent CMP process. Nutool [25] developed an electrochemical-mechanical deposition (ECMD) process with the goal of improving the plating process compared to conventional ECD. This technology combines some features of CMP with ECD, and can be looked upon as a variant of ECD. The ECMD approach deposits copper in an electrolytic bath, similar to ECD, but also employs a polishing pad which contacts the growing copper. The principle is that copper deposition on the regions in contact with the pad is inhibited, enabling the deposition process in recessed areas to proceed with plating similar to the conventional ECD.

1.3 Copper CMP

Chemical-mechanical planarization or polishing (CMP) is a critical and dominant technology used for the fabrication of advanced copper interconnects in deep sub-micron integrated circuits. CMP removes excess materials from uneven topography on a wafer surface resulting from other processes, such as ECD and chemical vapor deposition (CVD), and planarizes the wafer surface. The flat polished surface makes subsequent photolithography accurate by reducing the required depth of focus, and enables multilevel copper interconnects to be stacked with minimal height variations. CMP provides an economic and efficient way to planarize the wafer globally and locally in the copper era. The planarization capability of CMP has helped make it possible to continuously shrink feature size in recent technology generations.

CMP combines chemical and mechanical interactions to planarize metal and dielectric surfaces using a slurry composed of chemicals and/or sub-micron size abrasive particles [26]. Both the chemical and mechanical components contribute to the final material removal effect. It is not possible to clearly separate them due to synergistic effects, where abrasive action is enabled by chemical modification of the wafer surface. Many variables are involved in the CMP process, which makes CMP hard to control and optimize.

1.3.1 CMP Tools

In CMP, a wafer is held by a carrier and pressed face down onto a platen covered with a polishing pad. The platen and carrier move relative to each other in a rotary, linear or orbital fashion.

Figure 1-13 shows a schematic of a rotary CMP tool. The pad becomes glazed during polishing, resulting in decreased polish rates and quality. A diamond-tipped conditioner is used to recover the polishing ability of the pad and prolong the lifetime of the pad. Rotary CMP tools dominate the current semiconductor equipment market. Applied Materials holds the majority of CMP market share, followed by Ebara and Novellus. Applied Materials’ tools are typically rotary, and are widely installed and characterized in the semiconductor industry. Numerous pad and slurry consumable choices are readily available in the market, and they are continuously upgraded.

One key feature of rotary tools is that the rotation speed of the carrier can be matched to that of the platen, in the same direction. This setting forces the relative linear velocity between the pad and wafer to be equal at every point on the wafer. The uniform relative velocity profile can be advantageous in removal profile control; in practice, however, nonuniformities in pressure application, slurry distribution, and other wafer level dependencies typically result in optimization of the platen and carrier velocities to different values to compensate.
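
The uniform relative velocity obtained with matched rotation speeds follows from rigid-body kinematics and can be checked numerically. The sketch below is illustrative only: it treats the pad and wafer as rigid bodies rotating about parallel axes, ignores carrier oscillation, and uses hypothetical speeds and a hypothetical platen-to-carrier center offset.

```python
# Relative pad/wafer speed on a rotary tool, from rigid-body kinematics.
# Assumptions (not from the thesis): platen axis at the origin, wafer center
# at (center_offset, 0), both rotating in the same direction about parallel
# axes, with no carrier oscillation.

import math


def relative_speed(x, y, platen_rpm, carrier_rpm, center_offset):
    """Relative speed (m/s) at wafer point (x, y), measured from wafer center."""
    omega_platen = 2.0 * math.pi * platen_rpm / 60.0
    omega_carrier = 2.0 * math.pi * carrier_rpm / 60.0
    # Pad velocity at the point, which sits at (center_offset + x, y)
    # relative to the platen axis.
    pad_vx = -omega_platen * y
    pad_vy = omega_platen * (center_offset + x)
    # Wafer velocity at the point, rotating about the wafer center.
    wafer_vx = -omega_carrier * y
    wafer_vy = omega_carrier * x
    return math.hypot(pad_vx - wafer_vx, pad_vy - wafer_vy)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (0.0, -0.1), (-0.07, 0.07)]
    for label, carrier_rpm in (("matched", 60.0), ("mismatched", 30.0)):
        speeds = [relative_speed(x, y, 60.0, carrier_rpm, 0.2) for x, y in points]
        print(label, [f"{s:.3f}" for s in speeds])
```

With matched speeds every point sees the same relative speed, equal to the angular speed times the center offset; with mismatched speeds the relative speed varies across the wafer, which is the degree of freedom used in practice to compensate for other nonuniformities.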

A key issue for rotary tools, where the polishing platen is much larger than the wafer diameter, is that tool space efficiency, average pad utilization, and slurry utilization can be poor. In an effort to offset these disadvantages, additional kinematics have been introduced into volume production tools, such as carrier oscillation [27].

Figure 1-13. Rotary CMP tool [23].

One of the major CMP tool manufacturers, Novellus, has commercialized CMP systems using orbital kinematics. Figure 1-14 shows the orbital motions of the Novellus Xceda system. There are three motions in this system: orbital movement, rotation and oscillation. The major motion, orbital movement, means that the platen moves about a center that is not the center axis of the platen. Since the distance between the two axes, the orbital radius $R_o$, as shown in Figure 1-14, is only 1 to 3 inches, a high orbital speed, 500 to 1500 rpm, is required to reach a high enough relative velocity on the wafer. Without rotation of the wafer carrier and oscillation of the platen, the relative velocity profile on the wafer is flat. In order to avoid possible artifacts or a non-uniform removal rate profile across the wafer, other motions, including rotation of the wafer carrier and oscillation of the platen, are introduced into the system. The shortcoming in this case is a non-uniform relative velocity on the wafer. The much lower speed of these two motions limits the non-uniformity, and zonal pressure control can be used to compensate for non-uniform velocity profiles. There are some advantages for the orbital system compared to the more widespread rotary system. The platen is closer in size to the wafer, thus the footprint of the tool can be smaller and fab floor space utilization can be increased. Also, the tool can achieve improved slurry usage and has the possibility for rapid slurry change by using through-the-pad slurry delivery, as there is no need for rotary couplings for fluid delivery. In addition, the configuration can simplify the in-situ end pointing system, improve pad utilization, and wear out the pad more evenly [27]. However, the high orbital speed required by the small orbital radius can result in problems in system stability and durability. The higher pad usage also can shorten pad lifetime, which can decrease the effective tool time due to the need for more frequent polishing pad changes [27]. In many cases, however, pad lifetime is not a key disadvantage compared to the rotary system, because the major factors affecting pad lifetime are pad conditioning pressure and time.
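
The need for a high orbital speed follows directly from the small orbital radius. As a hedged back-of-the-envelope check, the sketch below evaluates the speed of a point moving on a circle of radius $R_o$ for the ranges quoted above; it assumes pure orbital motion with no carrier rotation or platen oscillation, and the function name is hypothetical.

```python
# Pad speed for pure orbital motion: every pad point moves on a circle of
# radius R_o, so the speed is 2*pi*R_o*N/60 for N revolutions per minute.
# Assumption (not from the thesis): carrier rotation and platen oscillation
# are neglected, so this is also the relative pad/wafer speed.

import math

METERS_PER_INCH = 0.0254


def orbital_speed(orbital_radius_in, orbital_rpm):
    """Return the orbital speed in m/s for a radius in inches and speed in rpm."""
    radius_m = orbital_radius_in * METERS_PER_INCH
    return 2.0 * math.pi * radius_m * orbital_rpm / 60.0


if __name__ == "__main__":
    for radius_in, rpm in ((1.0, 500.0), (1.0, 1500.0), (3.0, 500.0)):
        print(f"R_o={radius_in} in, {rpm:.0f} rpm: {orbital_speed(radius_in, rpm):.2f} m/s")
```

For a 1 inch orbital radius at 500 rpm this gives roughly 1.3 m/s.
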
LAM developed a CMP system using a different combination of motions, referred to as linear kinematics. As shown in Figure 1-15, a continuous polishing belt carrying a conventional polishing pad is driven by a motor at a constant velocity. If the wafer does not rotate, the relative velocity at every point on the wafer is the same. The relative velocity can reach 2 m/sec without hydroplaning [27, 28]. Pressure control comes from an air bearing under the polishing belt, as shown in Figure 1-15, rather than from the wafer carrier.

Hydroplaning occurs when the wafer is not in direct contact with the pad asperities and instead slides on a thin slurry film. In real applications, the process window is selected to avoid hydroplaning, because material removal rates are extremely low in the hydroplaning condition. There are three wafer/pad contact regimes: direct contact, semi-direct contact and hydrodynamic contact (hydroplaning). Slurry viscosity, relative velocity and applied pressure on the wafer determine which contact regime applies, which in turn determines the friction force (or coefficient of friction) and slurry film thickness, as shown in Figure 1-16. The friction force can be related to the material removal rate. High applied pressure and low relative velocity are preferred to ensure at least semi-direct contact. However, polishing that is compatible with low-K materials has the opposite requirements, low pressure and higher velocity, which makes CMP challenging at the 65 nm and smaller technology nodes. Avoiding hydroplaning is therefore an important issue for current tool designers.

1.3.2 CMP Mechanisms and Models

Abrasive-free polishing (AFP) is an important advanced technology in copper CMP. It has matured in recent years and is entering mainstream use. AFP has the following advantages: higher selectivity of copper to barrier, lower dishing and erosion due to the absence of the hard particles found in conventional slurry, a longer over-polishing window, low micro-scratching, low particle residue, easier post-CMP cleaning and waste stream treatment, minimal slurry handling issues, and good integration with existing polishing tools [29, 30]. A schematic diagram comparing the CMP of a copper film using a conventional slurry with abrasive particles and the abrasive-free polishing solution is shown in Figure 1-17. In the conventional CMP process, the slurry oxidizes the surface of the copper film. The hard and stiff copper oxide covers all copper areas on the wafer. In order to remove the copper oxide layer, hard ceramic particles are introduced into the slurry. Under the pressure applied by the pad, the abrasives sweep away the reaction layer and expose fresh copper to the reactive slurry. For Ta and oxide CMP, a passivation layer is formed at the top of the wafer by reaction with the slurry. Under the applied pressure, abrasive particles in the slurry help to remove the passivation layer, similar to the action of a conventional copper slurry. In AFP, the chemicals in the slurry are modified to form a softer corrosion-resistant complex on top of the copper film, which can be removed by pad pressure and motion without abrasive particles.

![Diagram](image)

**Figure 1-17.** Schematic illustration of copper CMP mechanisms using (a) the conventional slurry and (b) the AFP solution [29].

Due to the complexity of the CMP process, understanding of the process remains incomplete. In the last ten years, tremendous effort has been spent on building mathematical models of the CMP process at the feature scale, chip scale, and wafer scale. However, each scale has its own special issues and corresponding specialized modeling approaches. Due to the complex interactions in CMP, it is hard to develop a universal model to explain everything in CMP. Some efforts have been dedicated to developing so-called integrated models that connect the different scales.
Generally, wafer-scale models focus on within-wafer non-uniformity and wafer-to-wafer non-uniformity. For chip-scale models, pattern dependency is the principal concern. In feature-scale models, the major issues are the material removal mechanism and consumable effects on the CMP process. Here, some material removal models are briefly reviewed. More details about chip-scale models will be discussed in Chapter 4.

The fundamentals of the CMP polishing mechanism, especially the polishing rate dependencies, are critical in CMP development and process optimization for advanced interconnect fabrication. Preston's equation, which was developed to explain glass polishing [32], has been commonly used as the fundamental polishing rate equation in modeling of CMP pattern density dependence. In Preston's equation, the material removal rate ($RR$) is described as:

$$RR = K_p P V \tag{1-1}$$

with $P$ being the applied pressure, $V$ the relative velocity between the polishing pad and the polished wafer, and $K_p$ Preston's coefficient.
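
A minimal sketch of Equation (1-1) is given below to make the pressure and velocity trade-off concrete. The Preston coefficient and operating point are hypothetical placeholders chosen only for illustration; they are not fitted values from this work.

```python
PSI_TO_PA = 6894.76  # pascals per psi


def preston_rate(k_p: float, pressure_pa: float, velocity_m_s: float) -> float:
    """Preston removal rate RR = K_p * P * V, in m/s when K_p is in 1/Pa."""
    return k_p * pressure_pa * velocity_m_s


# Hypothetical Preston coefficient (1/Pa), for illustration only.
K_P = 1.0e-13

high_p = preston_rate(K_P, 2.0 * PSI_TO_PA, 1.0)  # 2 psi at 1 m/s
low_p = preston_rate(K_P, 1.0 * PSI_TO_PA, 2.0)   # 1 psi at 2 m/s

# Under Preston's equation only the product P*V matters, so halving the
# pressure while doubling the velocity leaves the removal rate unchanged.
print(f"2 psi, 1 m/s: {high_p * 1e10 * 60:.0f} Angstrom/min")
print(f"1 psi, 2 m/s: {low_p * 1e10 * 60:.0f} Angstrom/min")
```

The equality of the two rates holds only as long as the linear dependence of Equation (1-1) applies.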

However, such a linear dependence of material removal rate on pressure is not necessarily applicable to advanced CMP processes. Many non-Prestonian equations have been suggested to characterize advanced CMP processes. Tseng and Wang [33] propose an analytical CMP model which describes the material removal rate as:

$$RR = M P^{5/6} V^{1/2} \tag{1-2}$$

where $M$ is the weighting factor due to removal rate from other processes such as slurry attack. Wrschka et al. [34] generalize the equation to $RR = M P^a V^b$, where $a$ and $b$ are two parameters found by fitting the experimental data. Zhao and Shi [35] develop a material removal rate equation, based on the real contact area between the pad and the wafer, with a sublinear dependence on the pressure:

$$RR = K(V) P^{2/3} \tag{1-3}$$

where $K(V)$ is a function of the relative velocity $V$ and other CMP parameters. Furthermore, the tribological interaction among abrasive particles, the polished wafer surface, and the polishing pad introduces a threshold pressure $P_{th}$ into their equation:

$$RR = K(V) \left( P^{2/3} - P_{th}^{2/3} \right) \tag{1-4}$$

Ahmadi and Xia [36] use mechanical contact theory and develop a model for interactions of pad asperities with abrasive particles and wafer. The removal rate dependence on the pressure and velocity is related to the distribution of pad asperities. When the pad asperities have a random distribution, Preston’s equation is valid, while sublinear equations, such as Zhao and Shi’s equations, are required if the pad asperities have a wavy distribution.
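
The non-Prestonian forms reviewed above can be compared with a short sketch. The generalized power law follows $RR = M P^a V^b$ and the threshold form follows Equation (1-4). All prefactors and the threshold pressure are hypothetical placeholders, and returning zero below the threshold is an assumption made here to keep the rate non-negative; it is not stated in the cited models.

```python
def power_law_rate(m: float, pressure: float, velocity: float,
                   a: float, b: float) -> float:
    """Generalized removal rate RR = M * P**a * V**b."""
    return m * pressure ** a * velocity ** b


def threshold_rate(k_v: float, pressure: float, p_th: float) -> float:
    """Sublinear rate with threshold, RR = K(V) * (P**(2/3) - P_th**(2/3)).

    k_v stands in for K(V) at a fixed velocity. The rate is clamped to
    zero below the threshold pressure (an assumption of this sketch).
    """
    if pressure <= p_th:
        return 0.0
    return k_v * (pressure ** (2.0 / 3.0) - p_th ** (2.0 / 3.0))


# Effect of doubling the pressure at fixed velocity (arbitrary units).
preston_gain = power_law_rate(1.0, 2.0, 1.0, 1.0, 1.0) / power_law_rate(1.0, 1.0, 1.0, 1.0, 1.0)
tseng_wang_gain = power_law_rate(1.0, 2.0, 1.0, 5 / 6, 1 / 2) / power_law_rate(1.0, 1.0, 1.0, 5 / 6, 1 / 2)
zhao_shi_gain = threshold_rate(1.0, 2.0, 0.0) / threshold_rate(1.0, 1.0, 0.0)

print(f"Preston (a = 1):       x{preston_gain:.2f}")
print(f"Tseng-Wang (a = 5/6):  x{tseng_wang_gain:.2f}")
print(f"Zhao-Shi (a = 2/3):    x{zhao_shi_gain:.2f}")
print(f"Below threshold:       {threshold_rate(1.0, 0.5, 1.0):.1f}")
```

The sublinear exponents give a smaller gain from added pressure than Preston's equation, which is the behavior the non-Prestonian models are meant to capture.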

The pad asperity distribution or roughness has also been applied to the pattern dependence in CMP. Vlassak uses a contact mechanics analysis to evaluate the local pressure distribution between features on the wafer and the polishing pad, based on the compliance of the pad and its roughness, and thus predicts the dishing and erosion during CMP that are controlled by this local pressure distribution [37]. Most of the effects of pattern density, line width, applied down force, selectivity, and pad properties on both dishing and erosion evaluated with the model are in good agreement with the available data by Tugbawa [38]. The model captures the physical fundamentals in the CMP process related to pressure dependence.

1.3.3 CMP Challenges and New Technologies

As a key process in multilevel copper metallization, CMP faces tremendous challenges in current and future technology nodes. Low down force and low shear force polishing for compatibility with new materials, polishing planarity and material loss, and cost of ownership (COO) are the major issues to be addressed in the near future. Other challenges, such as wafer cleaning, defectivity, and waste processing are also important but will not be discussed here.

Low-K materials are required at the 65 nm technology node and below. The aggressive switch from oxide to low-K materials brings with it a number of manufacturing problems, especially those associated with CMP. Shear force generally is determined by down force and coefficient of friction. Due to the low resistance of low-K materials to shear force, low polishing down force, at 1 psi or below, is believed to be needed in the next generation CMP process to avoid delamination and cracking [39], although there is some progress in improving low-K material mechanical strength [39]. The increase in the number of copper interconnect levels further motivates the need for low down force. Some researchers claim that stress accumulated at the barrier/dielectric interface under high down force can cause delamination in the subsequent low-pressure barrier polishing step, even when no delamination occurs during the bulk copper polishing steps [39]. However, low-K materials are not the only driving force for low down force. The smaller pad bending under lower down force also decreases dishing of open features.

In the next several paragraphs, the copper CMP process flow and polishing performance definitions will be discussed first, then the polishing performance requirements will be listed and reviewed.

Most copper CMP processes have two polishing steps: bulk copper removal followed by diffusion barrier layer removal, as shown in Figure 1-18. The final step is to buff, clean and passivate the wafer for corrosion prevention [40]. There are several stages in bulk copper removal. First, most of the bulk copper is removed under comparatively high down force and high relative velocity, until there is about 2000 Å of the copper film remaining. Then, lower down force (~1 psi) and lower relative velocity are used to remove the remaining thin copper film over the field dielectric. This approach, called soft landing, can decrease defects and dishing. Usually, the soft landing stage will continue for tens of seconds after the barrier layer is first exposed until no copper remains above the dielectric across the whole wafer. This additional time is called the over-polishing time. Sometimes, a different polishing slurry with high selectivity to barrier and copper (i.e., having lower barrier and copper removal rates compared to dielectric removal rate) and perhaps a different polishing pad are used in over-polishing, especially when the copper thickness is not uniform across the wafer or when underlying topography exists, due to other process steps or previous interconnect level manufacturing. However, it is generally not cost-efficient and throughput efficient to switch to another platen for over-polishing. In the second step in the CMP process, the barrier layer is removed. In this step, a common approach is to use a non-selective slurry and continue polishing to remove the underlying dielectric, oxide or low-K material, to a depth of about 100-200 Å, after the barrier layer is gone. The purpose is to help correct underlying topography in
multilevel copper metallization, although at the cost of greater overall copper line thickness loss, and also to remove dielectric faceting from other processes used in dual-damascene processing [40]. A widely used CMP tool is the Reflexion LX from Applied Materials, which utilizes three platens and one integrated cleaner [41]. The three platens are sequentially used for bulk copper removal, copper clearance and over-polishing, and finally barrier removal and wafer rinsing. Table 1-1 shows the generic process parameters for a 90 nm process [41]. Two platens are used in Novellus’s Xceda CMP tool: one for bulk copper removal, soft landing and over-polishing; and one for barrier removal. In contrast, Ebara’s FREX300 tool performs all CMP steps, copper bulk removal, copper clear/soft landing and barrier removal, on a single platen [40]. Clearly, the number of platens used across the copper CMP steps has a major impact on throughput.

Table 1-1. Generic process parameters for 90 nm process [41].

<table>
<thead>
<tr>
<th>Platen 1: Bulk Copper Removal</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Film Thickness Å</td>
<td>8000</td>
</tr>
<tr>
<td>Removal Rate Å/min</td>
<td>8000</td>
</tr>
<tr>
<td>Slurry Flow Rate ml/min</td>
<td>250</td>
</tr>
<tr>
<td>Slurry cost $/litre</td>
<td>11</td>
</tr>
<tr>
<td>Pad</td>
<td>IC-1010</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Platen 2: Copper Clearance + Over-polishing</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Film Thickness Å</td>
<td>2000</td>
</tr>
<tr>
<td>Removal Rate Å/min</td>
<td>2500</td>
</tr>
<tr>
<td>Slurry Flow Rate ml/min</td>
<td>250</td>
</tr>
<tr>
<td>Slurry cost $/litre</td>
<td>14</td>
</tr>
<tr>
<td>Pad</td>
<td>IC-1010</td>
</tr>
<tr>
<td>Copper Clearance Time (sec)</td>
<td>40</td>
</tr>
<tr>
<td>Over-polishing Margin (%)</td>
<td>15</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Platen 3: Barrier Removal + Buff</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Film Thickness Å</td>
<td>250</td>
</tr>
<tr>
<td>Removal Rate Å/min</td>
<td>500</td>
</tr>
<tr>
<td>Slurry Flow Rate ml/min</td>
<td>250</td>
</tr>
<tr>
<td>Slurry cost $/litre</td>
<td>14</td>
</tr>
<tr>
<td>Pad</td>
<td>Politex</td>
</tr>
<tr>
<td>Polish time (sec)</td>
<td>30</td>
</tr>
</tbody>
</table>
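
To give a feel for how the parameters in Table 1-1 combine, the sketch below estimates slurry volume and slurry cost per wafer on each platen. The table values are copied as given, but the timing rules are assumptions of this sketch: slurry is taken to flow only while polishing, the platen 1 time is taken as film thickness divided by removal rate, and the platen 2 time is taken as the copper clearance time extended by the over-polishing margin.

```python
# Values from Table 1-1; polish times follow the assumptions stated above.
PLATENS = [
    # (name, slurry flow in ml/min, slurry cost in $/litre, polish time in s)
    ("Platen 1: bulk copper removal", 250.0, 11.0, 8000.0 / 8000.0 * 60.0),
    ("Platen 2: clearance + over-polish", 250.0, 14.0, 40.0 * (1.0 + 0.15)),
    ("Platen 3: barrier removal + buff", 250.0, 14.0, 30.0),
]


def slurry_volume_ml(flow_ml_min: float, polish_time_s: float) -> float:
    """Slurry consumed during one polish step, in millilitres."""
    return flow_ml_min * polish_time_s / 60.0


def slurry_cost(flow_ml_min: float, cost_per_litre: float,
                polish_time_s: float) -> float:
    """Slurry cost of one polish step, in dollars."""
    return slurry_volume_ml(flow_ml_min, polish_time_s) / 1000.0 * cost_per_litre


total_cost = 0.0
for name, flow, cost, time_s in PLATENS:
    step_cost = slurry_cost(flow, cost, time_s)
    total_cost += step_cost
    print(f"{name}: {time_s:.0f} s, "
          f"{slurry_volume_ml(flow, time_s):.0f} ml, ${step_cost:.2f}")
print(f"Estimated slurry cost per wafer: ${total_cost:.2f}")
```

This estimate covers slurry only and ignores pads, conditioners, rinse steps and any slurry flow outside the polish time, so it should be read as a lower bound under the stated assumptions.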
Copper CMP suffers from problems of dishing and erosion, which are heavily dependent on underlying layout pattern density and feature size, as shown in Figure 1-19. The key quantities related to pattern dependency which characterize CMP polishing performance and planarity are dishing, erosion, copper loss and dielectric loss. Dishing refers to the copper loss relative to the level of the neighboring dielectric space; usually only wide trenches or open structures have significant dishing. Erosion is another key issue, and is defined as the oxide/dielectric loss relative to the dielectric surface level of neighboring field areas. Copper and oxide loss are the total copper and oxide removal, referenced to the bottom of the barrier layer. The major sources of dishing and erosion are post-plating topography (copper ECD pattern dependency) and over-polishing, which depends on slurry selectivity and over-polishing time. Barrier polishing time and slurry selectivity also play a key role in dielectric loss, as well as additional copper loss. Dishing of wide trenches or open structures is often seen as the most critical and insidious problem due to depth of focus problems in lithography from significant non-planarity [40]. The key variables to control dishing, erosion, and copper/dielectric loss are post-plating topography, copper thickness, copper slurry and barrier slurry selectivity, pad mechanical properties and asperity distribution, over-polishing and barrier polishing time, and polishing pressure.
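
The definitions of dishing and erosion given above can be written as simple height differences. In the sketch below all surface heights are measured from a common reference, taken here as the bottom of the barrier layer, and the example heights are hypothetical numbers chosen only to exercise the definitions.

```python
def dishing(neighbor_dielectric_height: float, copper_height: float) -> float:
    """Copper recess relative to the neighboring dielectric surface."""
    return neighbor_dielectric_height - copper_height


def erosion(field_dielectric_height: float, array_dielectric_height: float) -> float:
    """Dielectric recess in an array relative to the neighboring field area."""
    return field_dielectric_height - array_dielectric_height


# Hypothetical post-CMP heights in nm, measured from the barrier bottom.
field_oxide = 500.0   # dielectric in the open field region
array_oxide = 480.0   # dielectric spaces inside a dense line array
array_copper = 465.0  # copper lines inside the same array

print(f"Erosion of the array: {erosion(field_oxide, array_oxide):.0f} nm")
print(f"Dishing of the lines: {dishing(array_oxide, array_copper):.0f} nm")
# Total copper recess below the field level is the sum of the two.
print(f"Copper recess below field: {field_oxide - array_copper:.0f} nm")
```

Keeping the two quantities separate in this way makes clear that a line in a dense array can lose copper thickness through erosion even when its dishing is small.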

Another solution to pattern dependencies is dummy filling. The layout density will be changed by adding non-functional features, called dummy fill. Some companies, like Praesagus, Inc. (San Jose), are working to develop density-based dummy fill methodologies to address pattern dependencies from ECD and CMP [40]. Dummy filling strategies are also implemented in shallow-trench isolation CMP processes to offset similar planarity problems [42].

![Copper CMP pattern dependency](image)

Figure 1-19. Copper CMP pattern dependency [23].

Dishing on a 100 μm feature and erosion on a 90% dense feature are widely accepted in the semiconductor industry as indicators of CMP process quality with respect to pattern dependencies. With scaling of the trench thickness, erosion and dishing have to be reduced correspondingly to meet performance requirements. It is a major challenge to reach the requirements set by the International Technology Roadmap for Semiconductors (ITRS), 2005 edition, which are increasingly stringent at 65 nm and below. Table 1-2 shows the target values for CMP pattern dependencies [2].

Table 1-2. MPU interconnect technology requirements—near-term years [2].

<table>
<thead>
<tr>
<th>Year of Production</th>
<th>2005</th>
<th>2006</th>
<th>2007</th>
<th>2008</th>
<th>2009</th>
<th>2010</th>
<th>2011</th>
<th>2012</th>
<th>2013</th>
</tr>
</thead>
<tbody>
<tr>
<td>Cu thinning at minimum pitch due to erosion (nm), 10% × height, 50% areal density, 500 µm square array</td>
<td>15</td>
<td>13</td>
<td>12</td>
<td>11</td>
<td>9</td>
<td>8</td>
<td>7</td>
<td>6</td>
<td>6</td>
</tr>
<tr>
<td>Cu thinning at minimum intermediate pitch due to erosion (nm), 10% × height, 50% areal density, 500 µm square array</td>
<td>17</td>
<td>14</td>
<td>13</td>
<td>11</td>
<td>9</td>
<td>8</td>
<td>7</td>
<td>7</td>
<td>6</td>
</tr>
<tr>
<td>Cu thinning of maximum width global wiring due to dishing and erosion (nm), 10% × height, 80% areal density</td>
<td>250</td>
<td>260</td>
<td>260</td>
<td>260</td>
<td>280</td>
<td>280</td>
<td>280</td>
<td>250</td>
<td>250</td>
</tr>
<tr>
<td>Cu thinning global wiring due to dishing (nm), 100 µm wide feature</td>
<td>24</td>
<td>21</td>
<td>19</td>
<td>17</td>
<td>15</td>
<td>14</td>
<td>13</td>
<td>13</td>
<td>10</td>
</tr>
</tbody>
</table>

White: Manufacturable solutions exist, and are being optimized
Yellow: Manufacturable solutions are known
Red: Manufacturable solutions are NOT known

Cost of ownership (COO) is a major long-standing concern for CMP due to its relatively high cost compared to other processes. Currently, the standard COO, including cost of consumables (COC) and depreciation of tools, is US$15 per 12 inch wafer. There is significant pressure to decrease the high COC, which accounts for more than 75% of COO [41]. However, the portion of cost due to depreciation of tools will likely increase in the future as CMP tools become more sophisticated. Chipmakers are requesting high throughput tools for 65 nm and below. The goal of 40 to 60 wafers per hour (wph) is a difficult challenge, considering the current standard value of 28 wph of the Applied Reflexion LK. New tool innovations and designs are needed to further increase tool productivity and decrease COC, and consumable manufacturers are facing aggressive pressure from chipmakers to cut cost as well as increase pad and slurry performance.

In order to improve polishing performance and compatibility with low-K dielectrics, cost-efficient low down force CMP polishing techniques have to be further developed. Several potential technologies, such as ACM’s electropolishing technology and Applied Materials’ electrochemical-mechanical polishing (ECMP), are competing fiercely to occupy the mainstream position in the next node.

ACM Research proposes that electropolishing can be the solution for the next generation copper planarization [24]. Electropolishing can be looked at as the reverse process of electroplating. The surface copper on the wafer, acting as an anode under external applied voltage, is converted to copper ions by losing electrons, which then dissolve into the electrolyte. The voltage will determine the current density, and the copper removal rate is proportional to the current density. This stress-free, non-contact process is friendly to low-K dielectrics. However, the price is low planarization capability due to nearly equal removal rates at field, protruding, and recessed locations [43]. There is no dielectric loss or erosion problem, however, since electropolishing is inert to non-conductive materials. Thus an initially flat topography is necessary to limit dishing due to low planarization ability. The use of conventional CMP to remove steps in the bulk copper prior to electropolishing is a potential solution, although this appears to require two equipment sets and steps. Another approach is to place dummy fill in wide structures to flatten the post-plating topography [24]. However, the problem of low planarization ability still limits the application of electropolishing in manufacturing.
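
The proportionality between copper removal rate and current density noted above can be sketched with Faraday's law of electrolysis. The sketch assumes that copper dissolves as Cu²⁺ (two electrons per atom) with 100% current efficiency and uses bulk copper density; the current density in the example is a hypothetical value, not a process setting from the cited work.

```python
FARADAY = 96485.33212      # C/mol
M_CU = 63.546              # molar mass of copper, g/mol
RHO_CU = 8.96              # bulk density of copper, g/cm^3
N_ELECTRONS = 2            # assumes dissolution as Cu(2+)


def electropolish_rate_nm_min(current_density_ma_cm2: float,
                              efficiency: float = 1.0) -> float:
    """Copper removal rate (nm/min) from Faraday's law.

    rate = efficiency * j * M / (n * F * rho), with j in A/cm^2.
    """
    j = current_density_ma_cm2 / 1000.0                      # A/cm^2
    rate_cm_s = efficiency * j * M_CU / (N_ELECTRONS * FARADAY * RHO_CU)
    return rate_cm_s * 1.0e7 * 60.0                          # cm/s -> nm/min


# Hypothetical current density of 10 mA/cm^2.
rate = electropolish_rate_nm_min(10.0)
print(f"10 mA/cm^2 -> {rate:.0f} nm/min")
print(f"20 mA/cm^2 -> {electropolish_rate_nm_min(20.0):.0f} nm/min")
```

Under these assumptions 10 mA/cm² corresponds to roughly 220 nm/min, and the rate scales linearly with current density. Because the rate depends only on the local current density, it does not by itself distinguish raised from recessed copper, which is consistent with the low planarization capability described above.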

Applied Materials’ electrochemical-mechanical planarization, or ECMP, seeks to solve the problems of CMP and electropolishing, while keeping the advantages of these two processes by combining electrochemical copper removal with conventional CMP [44]. ECMP has demonstrated its potential as the next generation planarization solution with low-K compatibility, low dishing and erosion, and good planarization capability. At the least, ECMP can serve as a complement to CMP, if not a replacement. In ECMP, the wafer with an applied voltage is submerged in an electrolyte, as in electropolishing. A specially designed chemical is added into the electrolyte that passivates the surface of copper to block copper dissolution. A rotating polishing pad then softly removes the contacted copper-complex passivation layer only on raised copper areas to open the path for copper dissolution under the applied voltage. The recessed areas are protected by the passivation layer and remain untouched. A similar mechanism is observed in abrasive-free polishing, except that the passivation coating in ECMP may be much softer. The CMP component, including the passivation coating and its mechanical removal, provides good planarization capability under down forces as low as 0.3 psi. The copper removal rate is dominated by the applied voltage rather than the applied pressure from the pad [44], because the copper removal process is electrochemically-assisted instead of mechanically-assisted. Production ECMP tools provide a combination of CMP and electropolishing platens [44]. To improve wafer level uniformity, Applied Materials’ ECMP tools have three different zones across the wafer to apply independent voltages. By tuning the voltages, the wafer level removal profile can be optimized to follow the incoming film thickness profiles from ECD and other processes [40]. Figure 1-20 compares the differences in polishing mechanism between electropolishing and ECMP.

ECMP is a promising technology innovation for copper planarization and has a number of advantages that stand out. First, the inherent low down force, down to 0.3 psi, maintains good contact between pad and copper surface while keeping ECMP friendly to the fragile low-K dielectrics used at 65 nm and below. This excellent compatibility with low-K materials makes ECMP superior to conventional CMP. The intrinsically high, voltage-controlled and pressure-independent removal rate improves tool throughput, and provides a removal profile control ability to minimize copper bulk polishing time and improve the polished topography. The shorter polishing time, the cheaper electrolyte compared to expensive copper slurry, and the longer pad lifetime due to low applied pressure give ECMP the edge in COC and productivity over conventional CMP. At the same time, the high planarization efficiency of ECMP helps create a highly planar polished surface near the completion of the bulk copper ECMP removal step, which assists in achieving low dishing and erosion and low defectivity. With this high planarization capability, it is possible to reduce the plated copper thickness from twice the trench depth down to about 1.2 times the trench depth, which improves ECD performance and productivity [46].

However, ECMP still has some limitations in its application to current semiconductor technology. Two conventional CMP steps are still required, following the bulk copper removal by ECMP, in order to achieve copper clearance and then barrier removal. The resulting multi-step process flow limits the advantages of ECMP, and offsets many of the potential advantages of ECMP in COC and productivity. The post-ECMP planarity eliminates one major source of dishing and erosion, the incoming post-plating topography. However, the over-polishing using conventional CMP required to clear the copper from field regions to account for both chip-scale topography and wafer-scale nonuniformity can still introduce significant dishing and erosion. The advantage of high removal rate in ECMP will be offset by the low removal rate in copper clearance and barrier removal steps under the required low pressure. Based on these limitations, substantial interest remains in using or improving conventional CMP to remove copper, barrier and dielectric efficiently under low pressure, as well as in extending ECMP to copper clearing and barrier removal stages.

An active technology path being investigated in the industry is to improve and extend conventional CMP to meet future requirements. These approaches include using conventional CMP under low pressure and high linear relative velocity, and exploring more chemically active slurries. Based on Preston’s relation, reasonable removal rates can be reached by increasing relative velocity under low down force, so long as hydroplaning can be avoided. Orbital CMP tools have some advantages over rotary tools in terms of high relative velocity. Upgraded orbital tools can achieve linear velocities from 2 m/sec to 8 m/sec or even higher. The key issue is to find process windows which avoid hydroplaning at low pressure and high velocity. The methods for improving slurry chemical activity are similar to the idea of abrasive-free polishing (AFP). Nikon researchers report the use of high speed conventional CMP and a highly chemically active slurry to obtain copper removal rates larger than 2000 Å/min under an ultra-low down force of 0.05 psi [47]. However, the key concerns for this strategy are the limited removal rate and relatively high dishing and erosion. In order to further improve the polishing productivity and planarity to compete with ECMP, dummy filling will likely be required, either alone or in conjunction with CMP and ECD process improvements. In Chapter 5, a strategy for dummy filling will be discussed, based on the co-optimization of the ECD and CMP processes.

1.4 Previous Research and Thesis Goals

As critical processes in multilevel copper metallization, ECD and CMP both suffer from pattern dependency problems. The initial non-uniform post-plating topography has a direct impact on the CMP polishing behavior and can exacerbate non-uniformity problems in CMP. The polishing non-uniformity primarily resulting from dishing and erosion leads to considerable surface topography, contributing to potential yield, reliability and manufacturability problems as well as process integration difficulties with other processes such as lithography. These major challenges in advanced copper metallization processes can be aggravated by the multilevel interconnect structures prevalent in real chips. The surface non-planarity of lower level copper metallization can influence the higher level topography and make it more uneven [23], as seen in Figure 1-21. The copper thickness variations resulting from dishing, erosion, and multilevel
topography result in non-uniform copper line resistance; these variations are a major concern in integrated circuit design and manufacture [48].
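
The link between copper thickness variation and line resistance can be illustrated with the usual relation $R = \rho L / (w t)$ for a rectangular line. The sketch below is a simplification: it uses the bulk copper resistivity, ignores the barrier layer and any size-dependent resistivity increase, and the line dimensions and thickness loss are hypothetical values chosen for illustration.

```python
RHO_CU_BULK = 1.68e-8  # bulk copper resistivity in ohm*m (room temperature)


def line_resistance(length_m: float, width_m: float, thickness_m: float,
                    resistivity: float = RHO_CU_BULK) -> float:
    """Resistance of a rectangular line, R = rho * L / (w * t)."""
    return resistivity * length_m / (width_m * thickness_m)


def relative_increase(nominal_thickness: float, thickness_loss: float) -> float:
    """Fractional resistance increase caused by a given thickness loss."""
    return nominal_thickness / (nominal_thickness - thickness_loss) - 1.0


# Hypothetical line: 1 mm long, 1 um wide, 300 nm thick, losing 24 nm.
nominal = line_resistance(1.0e-3, 1.0e-6, 300.0e-9)
thinned = line_resistance(1.0e-3, 1.0e-6, 300.0e-9 - 24.0e-9)

print(f"Nominal resistance: {nominal:.1f} ohm")
print(f"After 24 nm loss:   {thinned:.1f} ohm")
print(f"Relative increase:  {relative_increase(300.0e-9, 24.0e-9) * 100:.1f} %")
```

Because resistance varies inversely with thickness, a fixed thickness loss matters more as lines get thinner, which is one reason dishing and erosion budgets tighten with scaling.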

Modeling of pattern dependencies in the fabrication of multilevel copper metallization is needed in order to understand the fundamental limitations of interconnect fabrication technologies, to identify yield or performance problem spots on product chips, and to assist in new process design with enhanced robustness, reliability and manufacturability.

Although several CMP models have been proposed at the feature scale, chip scale and wafer scale, few of them involve integration with other critical processes in interconnect fabrication, particularly electroplating. In addition, most of the existing models do not consider the case of multilevel copper metallization. Park [23] and Tugbawa [49] have developed an integrated chip-scale model of pattern dependencies in copper electroplating and copper CMP processes. Their model integration is focused on the first metal level, and the multilevel case is not fully addressed.

There are additional shortcomings in the previous integrated ECD/CMP model [23, 49]. The previous ECD and CMP models and different steps in CMP are not easily integrated. For example, different versions of the CMP model are developed for each of the different stages of the CMP process (bulk removal, copper clearing and over-polishing, and barrier removal). Different terminology, definitions, and representations of
surface topography are used in the ECD and CMP models. These models also need to be extended to consider physical details in these processes for better accuracy and process flexibility. In Park’s ECD model, several problems affect the model accuracy and flexibility: the potential for over-fitting due to an empirical model structure; no clear definitions for line space, line width and field thickness in a random product layout; no consideration of copper ion inter-feature depletion effect; and the use of an empirical model for final thicknesses, lacking a physics based model for deposition rates. Similarly, there are several major problems in the previous CMP model: the empirical step-height model does not have a physical basis to fully explain the relation between the contact heights and the layout information; pad mechanical properties are largely implicit in the model and cannot model a priori the effects of different pad parameters; and pad surface conditions, especially the distribution of the pad asperities, are ignored. While the previous ECD and CMP models provided adequate accuracy for the previously existing plating and polishing processes, greater accuracy is needed for modern and emerging technologies which have reduced, but still critically important, dishing, erosion, and surface topography variations.

Our group has previously designed the well known MIT/SEMATECH 854 test masks for multilevel copper metallization. The first level masks have been widely used in the semiconductor industry for process characterization of ECD and CMP pattern dependencies. The three layouts or masks, for metal 1 (M1), via and metal 2 (M2) layers, can be used to develop integrated ECD and CMP models. The masks have copper line arrays with feature sizes in the range of 0.18 µm to 100 µm and layout densities in the range of 1% to 99% [23]. This test mask set enables the characterization of the coupled pattern dependencies in copper electroplating and CMP processes. The test structure layout for M1 and M2 is shown in Figure 1-22.

![Multilevel test mask layout](image)

Figure 1-22. Multilevel test mask layout [23]. Metal 1: blue (dark); Metal 2: magenta (light).

Although the previous research provides a good starting point for this thesis research, significant improvements and innovations are needed to reach the goal of modeling chip-scale thickness variations in multilevel copper metallization.

First, it is critical to have a systematic methodology to characterize the integrated effects of ECD and CMP, and the multilevel effects in copper metallization. This new methodology is a natural extension of the previous successful methodology and focuses on improvements for the multilevel metallization case, to accommodate a pre-existing surface topography such as that created by previous CMP steps. A carefully designed experiment plan is needed in order to realize the methodology, build the updated integrated models, and validate the models by simulating similar processes on different layout designs.

The new physics-based chip-scale copper ECD model must consider copper ion depletion effects, and surface additive adsorption and desorption by incorporating physical principles from existing successful feature-scale ECD models. The selection and simplification of a feature-scale ECD model must provide not only adequate accuracy, but must also achieve high computational efficiency in order to be used for full-chip simulation. In copper CMP, the integration of contact wear and density-step-height models needs to be more seamlessly implemented in an improved and coherent chip-scale model framework for copper bulk polishing, copper over-polishing, and barrier layer polishing. The pad topography properties, including pad bulk and surface mechanical properties and asperity statistical distribution, are necessary to reflect the physical details of CMP and improve model flexibility to incorporate these key process variables.

The integrated ECD/CMP model must also be extended to multilevel cases. The coupled plating/polishing multilevel simulation capability is needed to help achieve more environmentally benign processes, higher yield and performance, identify mask design weaknesses, and to enable the optimization of layout and dummy filling designs for electronic design automation (EDA).
Many new technologies, such as ECMP, are emerging due to the challenges from the increasingly stringent requirements of 65 nm technology and beyond. At the same time, conventional CMP needs to be extended and improved to meet the needs for 65 nm and below. Means to achieve thinner copper film deposition and low-down-force conventional CMP for low-k materials are needed. An integrated ECD/CMP model is needed to investigate the viability of this solution, to simulate the dishing and erosion resulting from different combinations of plating and polishing processes, and to enable the co-optimization of the two processes to address environmental and planarity concerns.

An improved plating model, in conjunction with the copper CMP model, can also form the basis for future research into processes which simultaneously accomplish both plating and polishing, such as electrochemically-assisted CMP. The integrated model for electroplating and CMP in multi-level metallization can also serve as the backbone for the long range goal of a complete back-end topography process simulator.

In summary, the main goals and objectives of this thesis are as follows:

1. Develop a chip-scale copper electroplating model that incorporates the effects of additives, feature-scale dependencies, and copper depletion, and can be applied to random layouts.

2. Develop an improved copper CMP model integrating contact wear, pattern density, and step height dependence in order to improve the prediction of dishing and erosion for random layouts.

3. Improve layout parameter extraction procedures and terminology for topographical features, enabling the seamless integration between
electroplating and CMP models in chip-scale simulation and prediction for random layouts.

4. Characterize and validate both metal 1 and metal 2 electroplating and CMP pattern dependent effects, and extend the integrated model into the multilevel cases.

5. Identify the impacts of temperature variation and pattern dependency of other processes on the CMP modeling.

6. Illustrate the co-optimization of electroplating and CMP and demonstrate the improvement in the topography and effective copper thickness from the use of in-pattern dummy fills.

1.5 Thesis Organization

This thesis is organized into six chapters. Chapter 2 summarizes the overall methodology for both electroplating and CMP characterization and modeling. A brief summary of the experiment plan and procedure follows the methodology introduction.

Once the overall methodology and experiment plan are presented, the subsequent chapters discuss in detail the new ECD and CMP models and their simulation and validation. In Chapter 3, the time-stepped version of the ECD model is developed by considering the details of additive adsorption and desorption, and by simplifying a successful feature-scale copper electroplating model. The inter-feature copper ion depletion effects and the random layout problem, which were ignored in the previous model, are also addressed. Model accuracy and computational efficiency are carefully balanced in this model. After reviewing the validation result for the first level case, this time-stepped model is extended to deal with uneven pre-plating topography, for modeling plating in multilevel copper metallization.

Chapter 4 introduces an improved and coherent chip-scale model framework for copper bulk polishing, copper over-polishing, and barrier layer polishing by seamlessly integrating contact wear and density-step-height models, and by considering the pad mechanical and surface topographical properties. Modeling of the multilevel effects is also addressed. The new CMP model is calibrated to experimental data for an example process, and validation results are presented. Finally, the implications for layout design rules are discussed.

The integrated ECD and CMP model is a powerful tool to optimize multilevel copper metallization and improve layout and dummy fill designs. Chapter 5 presents one application of the integrated model, to simulate and co-optimize the copper plating and CMP processes. To achieve a thinner copper film deposition and to support low-down-force conventional CMP, an approach using in-pattern dummy fills is presented.

Finally, Chapter 6 summarizes the major results and contributions of this thesis and discusses potential topics for future research in this area.
Chapter 2

Methodology and Experiment Plan

Before discussing the details of ECD and CMP models, it is necessary to introduce the overall methodology and experiment plan.

A systematic methodology is a must-have in order to develop an integrated model for multilevel copper metallization. Every module has to be carried out under the overall methodology to make sure the process characterization and validation are accurate and effective. The internal connection between the plating and CMP models must also be supported by the high-level framework. While the simulation model receives the most attention, it is not an exaggeration to say that the overall characterization, measurement, modeling, and validation methodology, as pictured in Figure 2-1, is the true cornerstone of our approach.

Based on the methodology, a carefully designed experiment plan is carried out to reach the objectives of this project. In this chapter, the experiment plan is briefly summarized to illustrate the overall methodology, and to provide background information on the data gathered for use in the subsequent development and testing of the ECD and CMP models described in Chapters 3 and 4.

2.1 Methodology

This section first presents the basic methodology to characterize and validate an electroplating process – specifically, to understand the post-plating height variations,
such as field thickness, step height and array height. Then, the basic methodology is applied to a specific copper CMP process to understand the relation between post-CMP topography and layout parameters. The development and application of a pattern dependent electroplating model is coupled to the characterization method. Both models are improved to accommodate the random layouts used in real product manufacturing, so that process simulation and optimization on real applications is possible. In order to enhance the coherent integration of electroplating and CMP modeling and simulations, the interrelated data interfaces are standardized. Although the characterization and validation processes are implemented in the first metal level, both of the basic models are designed to be inherently adjustable to uneven initial topographies. This property makes multilevel extension easy and efficient. After introducing the methodology for the first metal level, the strategy to extend from single level metal to multilevel metal cases is summarized.

Figure 2-1. Methodology for M1 ECD and CMP.
The key problems of dishing and erosion in copper CMP depend on the electroplated topography as well as the CMP process parameters and consumables, and the layout patterns. The electroplated topography is also dependent on the layout pattern. In order to deal with the coupled pattern dependent problems, the integrated characterization and modeling methodology for chip-scale copper electroplating and copper CMP process simulations is developed using a specially designed test mask. The methodology originally developed by Park [23], Tugbawa [49], and others, as shown in Figure 2-1, is applied to both first and second level ECD and CMP processes.

To characterize post-plated copper topography variation, a wafer is patterned with the test mask and electroplated. The topography is measured across both isolated features and many test pattern arrays, each having different layout parameters such as line width, line space, and pattern density. The measurement of relative surface height is typically obtained using a surface profiler. In addition, the absolute copper thickness is measured for translating relative surface height measurements into absolute thickness. The measured data is modeled as semi-empirical or physics-based functions of specific layout features by considering key pattern dependencies. While many of the parameters in the model may be based on known physics, a subset of the model parameters is tuned to the specific results of the plating process in order to capture unknown or complex effects of the specific process. After the model parameters are extracted to calibrate the model, the electroplating pattern dependent model is ready to be used to perform chip-scale simulations for the electroplated surface topography for other chips fabricated using the calibrated electroplating process. The post-plating simulation result can be used as the initial input in the subsequent CMP model.
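The translation from relative profiler heights to absolute thickness described above can be sketched as a simple offset. This is only an illustration: it assumes (not stated in the source) that one absolute thickness measurement site lies on the profile scan, so the scan can be shifted to agree with it at that point. The function and argument names are hypothetical.

```python
def to_absolute_thickness(relative_heights, ref_index, ref_thickness):
    """Shift a relative profile so it matches one absolute measurement.

    relative_heights: profiler heights along a scan (arbitrary zero), in um.
    ref_index: index of the scan point where the absolute thickness was
               measured (assumed to lie on the scan; illustrative only).
    ref_thickness: absolute copper thickness at that point, in um.
    """
    offset = ref_thickness - relative_heights[ref_index]
    return [h + offset for h in relative_heights]


# Illustrative numbers only, not measured data.
scan = [0.0, -0.2, 0.1]
absolute = to_absolute_thickness(scan, ref_index=0, ref_thickness=1.0)
```

With more than one absolute measurement site on a scan, a least-squares offset (or a tilt correction) could be used instead; the single-point shift is simply the smallest version of the idea.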
Characterization and modeling of pattern dependent variations of copper dishing and oxide erosion for the CMP process is similar to that used for the plating process. A patterned test wafer is polished using the particular process. After measuring dishing of copper lines and erosion of oxide as well as dielectric thickness of field areas, a semi-empirical or physics-based pattern dependent CMP model is developed and fit to the experimental data. The calibrated copper CMP model can then be used for chip-level simulation of any arbitrary layout. The prediction of dishing and erosion from the chip-scale simulation can be used for process optimization and design, among other purposes.

As mentioned, the post-plating topography information is transferred to the CMP model as the initial condition of the die surface prior to polishing. In previous work [23], the field thickness, array height and step height are key variables used to represent the plated topography and dishing and erosion for CMP. In order to more seamlessly integrate the modeling of different steps used in multilevel copper metallization, a general data interface is required. Here, the two key variables used for topography representation are the absolute heights of the top and bottom of the pattern, both referenced to the top of the barrier layer. All variables used in the previous models by Park [23] and Tugbawa [49] can be easily computed from these two general variables. Another significant advantage of this data interface is that it can be applied to any random layout, including those with no clearly defined field or open areas.

The overall process flow for modeling and simulation of multilevel copper metallization is shown in Figure 2-2. Here, we see that M1 post-CMP topography is first simulated with M1 layout extraction and calibrated ECD/CMP models; then the
topography is transferred into an M2 pre-plating topography by assuming that the interlevel dielectric (ILD) deposition is conformal; then M2 post-plating and post-CMP topographies are simulated with M2 layout extraction, M2 pre-plating topography and calibrated ECD/CMP models. These results will be checked with the measured M2 topography data to validate the ECD/CMP model calibrated using M1 data.

Figure 2-2. Process flow for modeling and simulation of multilevel copper metallization.
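The data flow in Figure 2-2 can be sketched as a short pipeline. This is a minimal sketch, not the thesis implementation: `ecd_model` and `cmp_model` are hypothetical callables standing in for the calibrated models, each layout is reduced to one number per grid cell, and the conformal ILD assumption is expressed by passing the M1 post-CMP height map on unchanged apart from a constant reference shift.

```python
def conformal_transfer(post_cmp_heights):
    """Conformal ILD assumption: every cell is raised by the same film
    thickness, so the relative height map is preserved. Heights are
    re-referenced to the highest cell (an illustrative convention)."""
    ref = max(post_cmp_heights)
    return [h - ref for h in post_cmp_heights]


def simulate_two_levels(m1_layout, m2_layout, ecd_model, cmp_model):
    """Chain M1 ECD -> M1 CMP -> ILD -> M2 ECD -> M2 CMP.

    ecd_model(layout, pre_topography) -> post-plating heights
    cmp_model(layout, plated_heights) -> post-CMP heights
    Both signatures are assumptions made for this sketch.
    """
    flat = [0.0] * len(m1_layout)           # M1 starts on a flat surface
    m1_plated = ecd_model(m1_layout, flat)
    m1_polished = cmp_model(m1_layout, m1_plated)
    m2_pre = conformal_transfer(m1_polished)  # M2 pre-plating topography
    m2_plated = ecd_model(m2_layout, m2_pre)
    m2_polished = cmp_model(m2_layout, m2_plated)
    return m1_polished, m2_polished
```

Keeping the models as interchangeable callables mirrors the idea that the same calibrated ECD/CMP models, tuned on M1 data, are reused for M2 with only the layout extraction and the pre-plating topography changed.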

2.2 Experimental Plan

The experimental plan based on the overall methodology is summarized in this section. First, we focus on the calibration process runs for ECD and CMP, including M1 and M2 test wafer fabrication. Second, we discuss the measurement plan corresponding to these process runs.

The design of experiments for electroplating is more straightforward than that for the CMP process. In order to realize the desire to model and track the plating profile evolution with time, several electroplated wafers with different target copper thicknesses
are measured to obtain the topographical data for model parameter extraction. The details of the DOE are shown in Table 2-1. The target thicknesses are evenly distributed in time. The measured data show slight discrepancy due to process control problems. Splits 2 to 4 are used in time-stepped ECD model calibration.

Table 2-1 Design of experiment for time-stepped ECD model.

<table>
<thead>
<tr>
<th># of Split</th>
<th># of Wafers</th>
<th>Cu Seed Layer Thickness (µm)</th>
<th>Electroplated Cu Thickness (µm)</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>2</td>
<td>0.15</td>
<td>~0.1</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
<td>0.15</td>
<td>~0.7</td>
</tr>
<tr>
<td>3</td>
<td>2</td>
<td>0.15</td>
<td>~0.9</td>
</tr>
<tr>
<td>4</td>
<td>2</td>
<td>0.15</td>
<td>~1.0</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>0.15</td>
<td>~1.3</td>
</tr>
</tbody>
</table>

For the CMP design of experiments, M1 patterned wafers and blanket wafers are used to extract the parameters of the improved CMP model. Oxide, copper and barrier blanket removal rates with respect to different slurries (copper slurry and barrier slurry) can be directly measured from polishing the corresponding blanket wafers. In order to capture the time-dependence of copper removal rate, a series of blanket wafers is polished for different times. Bulk copper partial polish and over-polishing using the copper slurry, and barrier polishing using the barrier slurry, are applied to the patterned test wafers. The relatively small dishing and erosion after standard M1 metallization, while desirable from a manufacturing perspective, makes model calibration difficult. In the subsequent M2 experiments, the M1 CMP is artificially designed to enlarge the unevenness of the topography for the 25 wafers used for M2 polish experiments. The M2 CMP uses the same standard process conditions (pressure, slurry, etc.) as in the previous single level experiment. The details of the DOE are shown in Table 2-2 and Table 2-3.
Table 2-2. Design of experiment for M1 ECD and CMP

<table>
<thead>
<tr>
<th># of Split</th>
<th># of Wafers</th>
<th>Slurry</th>
<th>Time (Sec)</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Bulk Copper Partial Polishing (Patterned Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>2</td>
<td></td>
<td>0</td>
<td>ECP/CMP</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
<td>1</td>
<td>10</td>
<td>CMP</td>
</tr>
<tr>
<td>3</td>
<td>2</td>
<td>1</td>
<td>20</td>
<td>CMP</td>
</tr>
<tr>
<td>4</td>
<td>2</td>
<td>1</td>
<td>30</td>
<td>CMP</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>1</td>
<td>40</td>
<td>CMP</td>
</tr>
<tr>
<td>6</td>
<td>2</td>
<td>1</td>
<td>50</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Bulk Copper Over-polishing (Patterned Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td>2</td>
<td>1</td>
<td>100 (EP)</td>
<td>CMP</td>
</tr>
<tr>
<td>8</td>
<td>2</td>
<td>1</td>
<td>130</td>
<td>CMP</td>
</tr>
<tr>
<td>9</td>
<td>2</td>
<td>1</td>
<td>160</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Barrier Polishing (Patterned Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>2</td>
<td>1/2</td>
<td>EP/30</td>
<td>CMP</td>
</tr>
<tr>
<td>11</td>
<td>2</td>
<td>1/2</td>
<td>EP/50 (Standard)</td>
<td>CMP</td>
</tr>
<tr>
<td>12</td>
<td>2</td>
<td>1/2</td>
<td>EP/70</td>
<td>CMP</td>
</tr>
<tr>
<td>13</td>
<td>2</td>
<td>1/2</td>
<td>EP/110</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Copper Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>14</td>
<td>2</td>
<td>1</td>
<td>10</td>
<td>CMP</td>
</tr>
<tr>
<td>15</td>
<td>2</td>
<td>1</td>
<td>20</td>
<td>CMP</td>
</tr>
<tr>
<td>16</td>
<td>2</td>
<td>1</td>
<td>30</td>
<td>CMP</td>
</tr>
<tr>
<td>17</td>
<td>2</td>
<td>1</td>
<td>40</td>
<td>CMP</td>
</tr>
<tr>
<td>18</td>
<td>2</td>
<td>1</td>
<td>50</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Copper Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>19</td>
<td>1</td>
<td>2</td>
<td>50</td>
<td>CMP</td>
</tr>
<tr>
<td>20</td>
<td>2</td>
<td>2</td>
<td>110</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Oxide Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>21</td>
<td>2</td>
<td>1</td>
<td>130</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Oxide Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>22</td>
<td>1</td>
<td>2</td>
<td>50</td>
<td>CMP</td>
</tr>
<tr>
<td>23</td>
<td>2</td>
<td>2</td>
<td>110</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Barrier Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>24</td>
<td>2</td>
<td>1</td>
<td>130</td>
<td>CMP</td>
</tr>
<tr>
<td><strong>Barrier Polishing (Blanket Wafer)</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>25</td>
<td>2</td>
<td>2</td>
<td>30</td>
<td>CMP</td>
</tr>
<tr>
<td>26</td>
<td>2</td>
<td>2</td>
<td>50</td>
<td>CMP</td>
</tr>
<tr>
<td>27</td>
<td>2</td>
<td>2</td>
<td>70</td>
<td>CMP</td>
</tr>
<tr>
<td>28</td>
<td>2</td>
<td>2</td>
<td>110</td>
<td>CMP</td>
</tr>
</tbody>
</table>

Table notes:
1. Wafers are to be polished in partial time increments so that there are at least three partially polished wafers with patterns visible for surface profile measurement. Copper blanket monitor wafers are polished after every two patterned wafers to obtain removal rate information, especially as a function of time. Thus, one patterned wafer is polished, then one blanket wafer for the same time; then the next patterned wafer, then a blanket wafer for the same polishing time, and so on for each polishing time.

2. EP = Standard Polishing Time of Slurry 1 (Copper)
3. Every wafer is polished with copper slurry for the standard time. The standard time with barrier slurry is about 60 sec.

Table 2-3. Design of experiment for M2 ECD and CMP.

<table>
<thead>
<tr>
<th># of Split</th>
<th># of Wafers</th>
<th>Slurry</th>
<th>Time (Sec)</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td></td>
<td></td>
<td>M1</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
<td>1/2</td>
<td>0</td>
<td>M1 CMP</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>M1 CMP + M2 ECD</td>
</tr>
<tr>
<td>3</td>
<td>1</td>
<td>1/2</td>
<td>20</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>4</td>
<td>2</td>
<td>1</td>
<td>30</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>5</td>
<td>2</td>
<td>1</td>
<td>40</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>6</td>
<td>2</td>
<td>1</td>
<td>50</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>7</td>
<td>2</td>
<td>1</td>
<td>100 (EP)</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>8</td>
<td>2</td>
<td>1</td>
<td>130</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>9</td>
<td>2</td>
<td>1</td>
<td>160</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>1</td>
<td>200</td>
<td>M1 CMP + M2 ECD + M2 CMP Step 1</td>
</tr>
</tbody>
</table>

2.3 Measurement Plan

Measurement is a critical and tedious step in the experiments. The necessary measurements include a number of pre-CMP and post-CMP measurements. Pre-CMP copper thickness is measured at several locations and profiles for the specified arrays within the die for the electroplated patterned wafers. The post-CMP copper thickness at several locations and profiles for the specified arrays within the die are needed, both for partially polished wafers (where the copper still covers all or most of the structures), and for wafers polished through to clearing, over-polish, and barrier removal. Pre-CMP and
post-CMP copper thickness, barrier thickness, and dielectric thickness across blanket wafers are also measured. The copper thicknesses on patterned wafers are measured using the Impulse tool at Philips Analytical. A profiler, such as the KLA/Tencor high resolution profilometer (HRP), is used to measure the surface height topography from ECD and CMP. Atomic force microscopy (AFM) is a supplemental tool used to measure copper dishing, particularly in very fine features. The locations of profilometry and thickness measurements for M1 are shown in Figure 2-3. For the M2 mask layer, the measurement plan is similar.

![Figure 2-3. Measurement plan for electroplating and CMP: MIT 854 mask, level 1.](image)
2.4 Summary

This research adopts a semi-empirical characterization and modeling methodology. Test wafers having a wide variety of layout patterns are used to gather detailed information on pattern dependencies in specific ECD and CMP processes. The plating and polishing models developed later in this thesis are then tuned to the specific processes. By incorporating key physical dependencies, the models should then be able to predict the topography across the entire chip for new and different product layouts.

An experimental plan is presented which provides for multiple time-slices in the plating of patterned wafers, to support the development of a time-evolution ECD model. The plan also generates multiple partial polish time-slices for both single level M1 metal layer, and multiple-level M2 metal layers, to support the development and testing of an improved multilevel CMP model. Measurements of a wide variety of line width, line space, pattern density, and isolated features using surface profilometry and thin film thickness tools provide surface height and thickness information to tune and test the coupled ECD/CMP models described in Chapters 3 and 4.
Chapter 3

Time-Stepped Chip-Scale ECD Model

As discussed in the previous chapters, the copper ECD process has been found to strongly depend on the chip layout and patterned features, just as copper CMP does. Not only does the post-electroplating topography affect the subsequent CMP process, but also the post-CMP height variation can potentially impact the higher level electroplating processes. Thus a chip-scale multilevel ECD model with reasonable accuracy and computational load is required for accurate CMP simulation, and this model must be seamlessly integrated with chip-scale CMP modeling in order to accurately predict topography in multilevel copper metallization. The CMP model will be described in detail in Chapter 4. Here, we focus on the development and testing of a new ECD model with improved accuracy, incorporating physical plating dependencies, and that also addresses pre-electroplating topography effects.

In Section 3.1, the previous related work is briefly summarized. The experimental data and measurements obtained for ECD model development are presented in Section 3.2. In Section 3.3, the terminology and framework of the new ECD model is presented, and the new model is developed in Section 3.4. Calibration results for the new model using the experimental data are presented in Section 3.5. Model verification results from another MagnaChip mask showing the validation of this model are presented in Section 3.6. In Section 3.7, metal 2 (M2) experimental data and modeling extensions are discussed. Finally, Section 3.8 summarizes the chapter.
The new time-stepped chip-scale ECD model is developed to address the copper ion depletion effect and the two-dimensional geometries found in arbitrary layouts, and to extend the model to various process conditions and to the multilevel copper metallization case.

3.1 Previous ECD Model

In Chapter 1, the basic mechanism of ECD and the key role of different additives were introduced. Here, a semi-empirical chip-scale ECD model developed by Park [23] is reviewed. This model and methodology is the starting point for the new model. The overall methodology for the development and application of the pattern-dependent electroplating model is similar to that used for CMP simulation, as discussed in Chapter 2.

The line width and line space are the key layout factors that affect the post-electroplated topography. Park uses two quantities, step height and array height, as shown in Figure 3-1, to describe the post-plating surface topography. Step height is defined as the height of the copper surface over the bottom of each copper line (or trench) with respect to the nearby copper surface over the oxide. If the copper line overfills, the value of the step height ($SH$) is positive. In most cases, the copper line is recessed and the value is negative. Array height, $AH$, is defined as the top surface height of copper over an array region with respect to the flat copper field region over a wide oxide area [23]. The same sign convention is applied to array height, so the array height of the field area is defined to be zero [23]. Thus, for fine line and fine space regions, a positive $AH$ is used to describe a bulge or excess of copper fill that often forms in plating above such regions. In contrast, a recessed area has a negative $AH$, as shown in Figure 3-1.

![Figure 3-1. Step height and array height definitions in ECD [23].](image)

Rather than being created using a first principles physics model, Park’s electroplating pattern dependency model is generated using multivariate response surface regressions to capture the post-plating surface height variation as a function of underlying layout parameters. This approach can effectively achieve the goal of chip-scale simulation for a real product layout with the aid of a layout pattern extraction tool. In the model, the basic electroplating and superfilling mechanisms have been considered to help physically motivate the model variables and guide the structure of the response surface equation. For example, a $1/(\text{line width})$ model term is included, and is used to describe the superfilling effect on the deposition rate and thus on the final surface profile.

Park uses the following response model to capture $AH$ and $SH$ variations:

$$AH = a_A W + b_A W^{-1} + c_A W^{-2} + d_A S + e_A (WS) + \text{Const}_A$$  \hspace{1cm} (3-1)

or

$$AH = a_A W + c_A W^{-2} + d_A S + e_A (WS) + \text{Const}_A$$  \hspace{1cm} (3-2)

\[ SH = a_S W + b_S W^{-1} + c_S W^{-2} + d_S S + e_S (WS) + \text{Const}_S \]

where \( SH \) is step height and \( AH \) is array height as defined earlier, \( W \) and \( S \) are line width and line space, respectively, and \( a_A \), \( b_A \), \( c_A \), \( d_A \), \( e_A \), \( \text{Const}_A \), \( a_S \), \( b_S \), \( c_S \), \( d_S \), \( e_S \), and \( \text{Const}_S \) are model coefficients. For small features, the \( W^{-1} \) and \( W^{-2} \) terms are dominant and capture superfilling effects. As the line width increases, however, the \( W \), \( S \), and \( W \cdot S \) terms take on major roles and conformal fill becomes dominant. The selection of particular terms in the model is not completely fixed and varies with the specific ECD process. The terms have to be tuned individually to capture the \( AH \) and \( SH \) trends with respect to \( W \).
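To make the structure of the response surface concrete, the following sketch evaluates one such polynomial for a given line width and line space. The coefficient dictionary keys and all numeric values are hypothetical placeholders, not fitted values from Park's model or from this work; the helper that reports the share of the inverse-width terms is added only to illustrate why those terms dominate for small features.

```python
def response_surface(width, space, coef):
    """Evaluate a*W + b/W + c/W**2 + d*S + e*W*S + const.

    width, space: line width W and line space S (same length unit).
    coef: dict with keys "a", "b", "c", "d", "e", "const" (hypothetical names).
    """
    return (coef["a"] * width
            + coef["b"] / width
            + coef["c"] / width ** 2
            + coef["d"] * space
            + coef["e"] * width * space
            + coef["const"])


def inverse_term_share(width, space, coef):
    """Fraction of the summed term magnitudes contributed by b/W and c/W**2."""
    inverse = abs(coef["b"] / width) + abs(coef["c"] / width ** 2)
    other = (abs(coef["a"] * width) + abs(coef["d"] * space)
             + abs(coef["e"] * width * space))
    return inverse / (inverse + other)


# Placeholder coefficients chosen only to exercise the functions.
demo_coef = {"a": 1.0, "b": 2.0, "c": 4.0, "d": 3.0, "e": 0.5, "const": 10.0}
value = response_surface(2.0, 4.0, demo_coef)
```

Setting a coefficient to zero drops the corresponding term, which is one way to express the alternative model forms without writing a separate function for each.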

In order to carry out chip-scale modeling, the layout is discretized into \( 40 \times 40\ \mu\text{m} \) grid cells, and the generalized average \( SH \) and \( AH \) are expressed as an area weighted average height variation across the binned layout information. Besides \( AH \) and \( SH \), the copper thickness variation of the wide field areas without any patterns has to be considered, because the experimental results show systematic, non-negligible nonuniformity in field thickness across the chip. This field thickness variation model is incorporated in the chip-scale simulation. The following model form is used to simulate the field thickness variation:

\[ FT_g = \alpha (\overline{SA_g} - C_o) + F_o \]  

where \( C_o \), \( \alpha \), and \( F_o \) are fitted model parameters, \( FT_g \) is the field thickness for the grid cell, \( \overline{SA_g} \) is an average value of the underlying cell surface area for grid cell \( SA_g \), where the average is taken over a substantial number of nearby cells. All of the nearby cells within a characteristic length \( P \) are assumed to contribute some effect on \( FT_g \). However, the physical origins of this effect are not clearly explained in this empirical model. The
calculation of $SA_g$ is shown in Figure 3-2, where $G$ is grid size, $A_S$ and $A_B$ are the areas of trench top (or trench surface) and bottom respectively, $H_0$ is trench depth, $L$ is wire length, and $N$ is number of patterns in this grid cell. We note that $A_W$, as expressed in the calculation, is the product of the total perimeter of all patterns in the grid cell and the trench depth.

![Figure 3-2. Surface area calculation [23].](image)
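The field thickness model above, a linear function of a neighborhood-averaged surface area, can be sketched as follows. The per-cell surface areas are taken as given inputs, since their calculation is defined in Figure 3-2. The uniform square averaging window and the expression of the characteristic length as a radius in grid cells are assumptions made for this sketch; the source does not specify how nearby cells are weighted.

```python
def neighborhood_mean(sa, row, col, radius):
    """Uniform average of sa over a square window clipped at the die edge.

    radius is the window half-width in grid cells (a stand-in for the
    characteristic length divided by the grid size; an assumption)."""
    rows, cols = len(sa), len(sa[0])
    total, count = 0.0, 0
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            total += sa[r][c]
            count += 1
    return total / count


def field_thickness(sa, alpha, c_o, f_o, radius):
    """FT_g = alpha * (mean SA around cell g - C_o) + F_o for every cell."""
    return [[alpha * (neighborhood_mean(sa, r, c, radius) - c_o) + f_o
             for c in range(len(sa[0]))]
            for r in range(len(sa))]


# Placeholder grid and parameters, not fitted values.
demo_sa = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
demo_ft = field_thickness(demo_sa, alpha=0.5, c_o=1.0, f_o=1.0, radius=1)
```

For a full-chip grid, the nested loops could be replaced by a separable box filter or a convolution; the direct form is kept here for readability.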

With a specially designed test mask and a methodology for chip-scale ECD surface topography modeling, the semi-empirical response surface model is extracted by Park to capture the pattern dependency in a relatively accurate and computationally-efficient way. However, there are several limitations in this framework. Significant efforts are required to improve this model for use in simulating multilevel copper metallization.

First of all, although this model is physically motivated, traditional multivariate fitting is used, so the physical meaning of the model parameters is not clear. For every process, a specific model with different model terms has to be used to fit the data. One calibrated model cannot be applied to another process with a different process time or a different trench depth. These are important variables that can be helpful in exploring and optimizing the process. If they are included explicitly in a model, rather than implicitly lumped into the empirical model parameters, the improved model can be used to predict post-plating topography as a function of time and trench depth, parameters that are likely to change more frequently over the lifetime of an ECD process. In addition, in the dual damascene process, the presence of vias in some regions below the metal lines conflicts with the assumption of a single uniform trench depth across the entire chip. In the more general case of multilevel copper metallization, the underlying surface variation might have some impact on the ECD processes.

A second concern arises from the use of more than ten model parameters which are fitted in Park’s model, which can result in potential over-fitting. In some cases, for example, a slight change in the data can substantially change the model fitting parameters. This problem is inherent in the largely empirical nature of the model.

A third limitation of the model is the difficulty in handling an arbitrary layout, due to limitations in the layout extraction. Extracted values for line width and line space alone within each cell cannot express all of the important layout information. In addition, the definition of line space can be difficult to apply to general two-dimensional patterns that include more complicated structures than simple arrays of long lines. In order to capture the two-dimensional effects in real product layouts, new variables with clearer definitions need to be used in the model. The last concern is that this model does not attempt to explain the field thickness variation physically, and neglects the so-called “edge effects” or “corner effects” in the ECD process observed in this thesis project.

The time-stepped chip-scale ECD model developed in the rest of this chapter aims at solving these major concerns. At the same time, the new model seeks to retain the advantages of relatively high accuracy and low computational load. In addition, the model structure and
implementation must be compatible with the CMP chip-scale model to enable ECD/CMP model integration.

### 3.2 Experimental Data and Measurements for Model Calibration

In this section, we introduce the experimental results from HRP scans, copper absolute thickness measurements, and SEM (scanning electron microscope) cross sections. These results correspond to the experimental and measurement plans for metal 1 described in Chapter 2. We first discuss measurements of the copper thickness in field regions, followed by presentation of surface profilometry scans over arrays and other patterned features. Finally, we discuss the observation of pronounced edge effects in these scans, and relate these to the proposed ion depletion effect included in the new ECD model.

#### 3.2.1 Field Area Measurements

A KLA-Tencor high resolution profilometry (HRP) tool is used to scan various structures and arrays of MIT/SEMATECH 854 M1 wafers with different electroplated copper thicknesses to construct the relative surface topography. The Philips Analytical Impulse 300 is an optoacoustic, non-destructive film thickness metrology tool for characterization of copper ECD and CMP [50]. Its ability to directly measure copper film thickness with high spatial resolution and accuracy provides data complementary to the HRP [49]. The absolute copper thickness data from the Impulse is combined with the relative height variation data to define the absolute topography needed in this research. This tool can measure the copper thickness in large field areas without patterns, as well as the copper thickness within copper lines if the laser spot size is less than the trench width. After very careful calibration, the tool can also measure the copper thickness over fine line arrays. Figure 3-3 shows the 41-site copper thickness measurement data for five MIT/SEMATECH 854 M1 electroplated wafers. Figure 3-4 shows the positions of the 41 sites for these M1 copper thickness measurements. These unpolished, patterned copper wafers with different film thicknesses are used to calibrate our time-stepped ECD model. Wafers CPT 115-01 and CPT 115-07 are representative of very thin and very thick plated copper films, respectively. The copper thickness of the other wafers is in the range of ordinary process conditions. The total copper thickness is equal to the sum of the copper seed layer thickness (1500 Å) and the electroplated copper thickness. One observation from Figure 3-3 is that the height variation is formed in the initial stage of electroplating (as seen in the CPT 115-01 wafer plated for the shortest time), and the variation remains (it does not scale or increase in magnitude) even as the overall copper plating thickness increases with longer plating times (as seen in the other wafers in Figure 3-3). This result can be explained by copper ion depletion effects and the proposed ECD model, as discussed later.

Figure 3-3. 41-site total copper thickness measurements for MIT/SEMATECH 854 M1 wafers.

#### 3.2.2 Line/Space Array Measurements

Due to spatial resolution limits of the HRP (whereby very narrow deep trenches cannot be accurately measured), SEM cross-sectional images are also used to provide complementary information on the surface topography. Figure 3-5 shows the measurement plan for HRP scans of the MIT/SEMATECH M1 electroplated wafers. The surface topography data extracted from the array scans will be used in model calibration. The long HRP scans, in turn, can be used to verify the calibrated model, as these encompass a substantial range of different line widths and line spaces within a single scan. The numbers on the arrays in Figure 3-5 show the corresponding line width ($W$) and line space ($S$), expressed as $W/S$. The bus structures, shown in more detail in Figure 3-6, have significantly different layout characteristics compared to the array structures. The varied numbers of lines in this region provide a powerful method to validate the extracted model, and help confirm the applicability of the model to an arbitrary mask with such complex patterns.

Figure 3-5. Array scans and long scans for M1 HRP measurements.

Figure 3-6. Bus structure of MIT/SEMATECH 854 M1 mask.

The following ten sets of plots, in Figures 3-7 through 3-16, show the topography data from array HRP scans and long HRP scans. Together, these scans explore a wide range of line width, line space, and pattern density combinations that typically appear in the different metal layers used in product chips. It should be noted that most product chips would not have the full range of these feature sizes and combinations on any single metal level. As discussed in Chapter 1, lower level metal tends to have smaller feature sizes for local interconnect, while the upper level metal layers tend to have much larger feature sizes for global interconnect, as well as power distribution. Thus, the MIT/SEMATECH 854 M1 mask, while labeled a “metal 1” mask, is intended to exercise and gather data on the pattern dependencies in CMP processes for the full range of feature sizes.
Figure 3-8. Post-plating surface HRP array scans for CPT 104-01 (X: μm, Y: Å).
Figure 3-9. Post-plating surface HRP array scans for CPT 115-04 (X: μm, Y: Å).
Figure 3-10. Post-plating surface HRP array scans for CPT 105-01 (X: µm, Y: Å).
Figure 3-11. Post-plating surface HRP array scans for CPT 115-07 (X: μm, Y: Å).
Figure 3-12. Post-plating surface HRP long scans for CPT 115-01 (X: µm, Y: Å).

Figure 3-13. Post-plating surface HRP long scans for CPT 104-01 (X: µm, Y: Å).
Figure 3-14. Post-plating surface HRP long scans for CPT 115-04 (X: µm, Y: Å).

Figure 3-15. Post-plating surface HRP long scans for CPT 105-01 (X: µm, Y: Å).
Figure 3-16. Post-plating surface HRP long scans for CPT 115-07 (X: μm, Y: Å).

Figure 3-6 shows that the HRP-measured step heights for 1 μm lines with different line spaces are very similar to each other. A possible reason is that the tip of the HRP cannot fully reach the bottom of deep trenches for lines of 1 μm width or smaller. Abrokwah [51] discussed similar limits of HRP scans in the context of etch pattern dependency research. Although the 1 μm trenches in some of the thin plating cases above have not fully filled and have substantial step heights, the smaller lines with 0.5, 0.25 and 0.18 μm line width are overfilled and show virtually no step height. The HRP results clearly demonstrate that overfilling progresses sequentially in time from the smallest features to micron-size ones, and that the time to overfill the sub-micron features is very short. After this short initial stage, the uneven copper deposition process smoothly reaches a pseudo-conformal deposition state. The simulation results in a later section will further show this transition. One potential strategy suggested by this phenomenon is to deposit thinner copper layers and use a CMP process with better planarization capability to remove the thinner overburden copper layer. Such process integration can improve process productivity, lower energy consumption and reduce the environmental waste load. The copper deposition thickness is thus limited by how effective the CMP process is in planarizing topography in thin layers, further motivating the need for the integrated ECD/CMP simulation capability.

One difficulty with the long scans of the bus structures is pinpointing the scan locations when using the HRP. As seen in the last figure (Figure 3-16), the scan start locations can vary by tens of microns due to variation in wafer positioning, and in some cases the scans do not intersect with the desired structures.

#### 3.2.3 Edge Effects in Array Scans

The HRP scans for fine line arrays, especially at 0.18 and 0.25 μm line width, in the last set of figures clearly and consistently show a positive spike on the edge of fine line arrays, and a negative spike on the edge of field regions. This observation holds for most or all of the dies, across multiple different electroplated wafers, and is not a measurement artifact. We suggest that a copper ion depletion effect, due to limited mass transport on the inter-feature scale, can be the reason for this topography. Similar observations can be found in other published plating processes. Figure 3-17 shows the HRP scans in Park’s thesis [23], where we see similar effects, although the amplitude of the edge effect is significantly weaker than what we see in our experimental MagnaChip ECD process. The details of the model for the copper depletion effect will be discussed in later sections. We can, however, exclude a number of other potential reasons for this effect, and provide an intuitive understanding of the effect here. Additive depletion, especially of a leveler, would act to eliminate these protrusions rather than create such spikes. We also find that the copper thickness of the field regions varies depending on the line width and pattern density of the nearby features. Additive adsorption and the overfilling effects scale with the effective total surface area, including the trench walls. The strong overfilling is believed to deplete the effective copper ion concentration on a millimeter lateral length scale and affect the copper deposition on the nearby field regions. The copper thickness variation of the field regions mentioned in Park’s thesis can be explained in this way, resulting in his negative fitted value of α.

Figure 3-17. Surface topography profiles for fine pitch structures, in the electroplating process analyzed by Park [26].

The copper ion depletion theory predicts a weaker edge effect for the edge scan (near the upper or lower edge of the array) on the fine line array than for the center scan (taken along the midpoint of the array). Figure 3-5 identifies the edge scan locations. Figure 3-18 shows the additional edge scans obtained in order to further confirm the theory. The results support the prediction, as seen by comparing the edge and center scans. Another observation is that the overfilling effects, and thus the edge effects, decrease with the line width for the sub-micron patterns. This result is consistent with the copper ion depletion effect.

Figure 3-18. Center (left) and edge (right) HRP scans for CPT 104-01 (X: µm, Y: Å).

In order to exclude the possibility of the measurements being due to artifacts from the HRP scans, and to confirm the asymmetry of the edge effects, the Philips Analytical Impulse tool is used to intensively measure the copper thickness above the fine line arrays in multiple dies and multiple wafers, as shown in Figure 3-19. The absolute copper thickness measurements match with the HRP scans with good accuracy. An interesting observation from this test is that the directions of the asymmetry of the edge effect in the top and bottom dies, and the left and right dies, are opposite. The centrifugal force from wafer rotation or convection-dependent adsorption of additives [52] could be the primary reason for this consistent asymmetry of the edge effect. A similar asymmetry can also be found in the ECD process analyzed by Park, as shown in Figure 3-17.
(a) Die maps

(b) Copper thickness above 0.18 μm line array

(c) Cu thickness above 0.25 μm line array

Figure 3-19. Copper thickness above 0.18 and 0.25 μm line arrays for multiple dies and multiple wafers (courtesy Philips Analytical).

### 3.3 Terminology

Before presenting the details of the ECD model, the definitions of some key variables used in this research need to be clarified first.

#### 3.3.1 Topography Variables

Topography variables characterize the surface height variation in a systematic way. As shown in Figure 3-20, the top of the barrier layer is defined as zero in height. The average height of the surface over the trench top in a grid cell \( T_{Top} \) and the average height of the surface over the trench bottom in a grid cell \( T_{Bottom} \) are two generalized variables used here to characterize the topography. The field region is considered as a special case having no patterns, and the height of the field regions, \( T_{Field} \), is equal to \( T_{Top} \) and \( T_{Bottom} \) in this framework. In this way, no separate variable is required to describe or characterize the field regions. These two generalized variables, \( T_{Top} \) and \( T_{Bottom} \), can also handle an arbitrary mask which does not have any clearly defined field regions.

The (effective) step height, \( S_{eff} \), is equal to \( T_{Top} - T_{Bottom} \). The sign of the step height is therefore positive for the non-overfilling case and negative for the overfilling case. The array height \( A \) used in this thesis is the height difference between the (top) surface of the nearby field regions and the top surface of the array for the non-overfilling case, or between the (top) surface of the nearby field regions and the bottom surface of the array for the overfilling case. Thus, the sign of the array height is positive, opposite to that in Park’s definitions. These sign conventions are chosen to be consistent with the definitions of step height and dishing used in CMP modeling. As shown in Figure 3-20, the (effective) step height, \( S_{eff} \), is not equal to the step height from the highest point to the lowest point in the pattern areas. Instead, the trench width changes in the post-plating topography are lumped into \( S_{eff} \). A major advantage of this approach is that it standardizes and simplifies the data extraction from the HRP scan data. Another simplification for data extraction is that the step height for overfilling cases is set to zero, so that \( T_{Top} \) and \( T_{Bottom} \) are both defined as the overall average height of the surface over a grid cell, including the trench top and bottom. Based on the SEM images, the post-plating surface for the overfilling case is quite complex and cannot be described as a simple rectangular shape. The high spatial frequency height variation might be thought to pose a significant challenge to chip-scale ECD modeling. Fortunately, the details of this kind of height variation can be ignored, since the primary aim of the chip-scale ECD model is to provide topography information for the following CMP steps. Thus, an approximation that preserves the volume of excess plated copper in such cases is adequate. The ECD model introduced in the following sections will correspondingly ignore the small surface variations in the sub-micron features having overfill.

Figure 3-20. Topography variables.
Figure 3-21 illustrates the envelope and step height extraction from the HRP scans for the arrays with fine line width and line space. Here, the envelope is defined as the average of the raised part of the copper surface within each grid cell minus the average value of the field copper thickness. The step height corresponds to the value $S$ as defined in Figure 3-20. If the line width or line space is larger than the grid size (20 µm) and the local pattern density is 1 or 0, the step height is set to zero and the envelope corresponds to the copper height.

![Figure 3-21. Envelope and step height illustration from HRP scan.](image)
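The topography variables defined above can be illustrated with a minimal sketch. It assumes that the height samples of one grid cell have already been separated into those over the trench tops and those over the trench bottoms, and that the samples are equally spaced so that a pooled mean approximates the cell average; the function name and argument layout are illustrative and are not taken from the extraction code used in this work.

```python
from statistics import mean

def topography_variables(top_heights, bottom_heights):
    """Return (T_top, T_bottom, S_eff) for one grid cell.

    Heights are measured from the top of the barrier layer (height zero).
    top_heights:    surface height samples over the trench tops
    bottom_heights: surface height samples over the trench bottoms
    """
    t_top = mean(top_heights)
    t_bottom = mean(bottom_heights)
    if t_bottom >= t_top:
        # Overfilled cell: the step height is set to zero and both
        # variables take the overall average surface height of the cell.
        overall = mean(list(top_heights) + list(bottom_heights))
        return overall, overall, 0.0
    # Non-overfilled cell: positive effective step height.
    return t_top, t_bottom, t_top - t_bottom

# Example with heights in angstroms (illustrative numbers only).
print(topography_variables([5000, 5200], [3000, 3200]))
print(topography_variables([5000, 5000], [5400, 5600]))
```

A field region is the special case in which both lists describe the same flat surface, so that the two variables coincide and the step height is zero.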

#### 3.3.2 Layout Variables

Figure 3-22 illustrates the layout extraction process. First, the GDS file of the layout is discretized into $20 \times 20$ µm squares, and then the layout extractor computes the layout parameters in each grid cell across the whole layout. The major layout parameters extracted are pattern density, (average) line width, line length, number of objects ($N$), and total perimeter ($P$) in the cell. The (local) pattern density is defined as the ratio of the pattern area to the grid cell area. All features in a cell are simplified into rectangular objects with width $W_1$ and length $W_2$, where $W_2 > W_1$. The average line width is the mean value of $W_1$ in a cell.

Figure 3-22. Layout variables extraction. The mask is discretized into 20 $\mu$m x 20 $\mu$m cells; the layout extractor computes layout parameters in each cell across the whole chip; all features in a cell are treated as rectangular objects with $W_1$ and $W_2$ (here, $W_2 > W_1$).
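The per-cell layout parameters can be sketched as follows. This is a minimal illustration that assumes the features have already been reduced to rectangles lying entirely inside one grid cell; clipping at cell boundaries and reading the GDS file are not handled, and the class and function names are hypothetical rather than those of the layout extractor used here.

```python
from dataclasses import dataclass

GRID_UM = 20.0  # grid cell edge length used in the text, in micrometers

@dataclass
class Rect:
    w1: float  # shorter side (line width), um
    w2: float  # longer side (line length), um

def cell_layout_parameters(rects, grid_um=GRID_UM):
    """Layout parameters for rectangles lying entirely inside one grid cell."""
    n = len(rects)
    pattern_area = sum(r.w1 * r.w2 for r in rects)
    return {
        # ratio of pattern area to grid cell area
        "pattern_density": pattern_area / (grid_um * grid_um),
        # mean of the shorter side W1 over all objects in the cell
        "avg_line_width": sum(r.w1 for r in rects) / n if n else 0.0,
        "num_objects": n,
        "total_perimeter": sum(2.0 * (r.w1 + r.w2) for r in rects),
    }

# Ten 1 um wide lines spanning a 20 um cell (1 um line / 1 um space array).
params = cell_layout_parameters([Rect(1.0, 20.0) for _ in range(10)])
print(params)
```

In this example the local pattern density is 0.5 and the average line width is 1 µm, as expected for an equal line/space array.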

There are a number of reasons for setting the grid size in this research to 20 $\mu$m. For feature sizes larger than 10 $\mu$m, the copper film growth in electroplating is close to conformal. The average asperity size in the CMP pad is 10 to 20 $\mu$m. For a feature with a line width of tens of microns, almost all asperities can contribute to the polishing process on the trench bottom. However, the polishing behavior for features of several microns is different due to the so-called asperity filter effect, which will be discussed in Chapter 4 focusing on CMP modeling. Considering the properties of both the ECD and CMP processes, 10 to 40 $\mu$m should be a good range for the grid cell size. To keep the model simple, it is critical to use one parameter to characterize the polishing surface topography in multi-level copper metallization. A small grid size improves the accuracy of approximating the surface topography with one parameter, considering that dishing is generally quite limited for small features. However, the computation cost increases rapidly with the number of grid cells. Balancing the modeling accuracy and computation cost, 20 $\mu$m is a good choice for chip-scale modeling on an ordinary PC. With larger memory and faster CPUs, it is possible to choose 10 \( \mu \text{m} \) as the grid size for a chip with reasonable area (typically 25 by 25 mm for larger chips).
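The growth of the problem size with decreasing grid size can be made concrete with a short calculation. The sketch below simply counts square grid cells on a square chip; the 25 mm chip edge is the value quoted above, and the helper name is illustrative.

```python
def grid_cell_count(chip_mm, grid_um):
    """Number of square grid cells covering a square chip of side chip_mm."""
    cells_per_side = round(chip_mm * 1000.0 / grid_um)
    return cells_per_side * cells_per_side

# Compare the candidate grid sizes in the 10 to 40 um range.
for grid_um in (40, 20, 10):
    print(grid_um, "um grid:", grid_cell_count(25, grid_um), "cells")
```

For a 25 by 25 mm chip this gives roughly 1.6 million cells at 20 µm and 6.25 million cells at 10 µm; halving the grid size quadruples the number of cells.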

Another important layout variable, effective pattern density, can be calculated from the local pattern density by averaging over a certain area around the grid cell, characterized by the so-called planarization length. The concept is very useful in CMP modeling and will also be used in the following ECD model. The basic idea behind the use of effective pattern density is to account for the contribution from neighboring regions. The weighting function used to calculate the effective pattern density is usually called an averaging filter. Five popular filters are shown in Figure 3-23 [53]. The Gaussian filter is one of the best of these weighting functions and is widely used in this research.

![Figure 3-23. Different weighting functions for calculation of effective pattern density [53].](image)
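As an illustration of the averaging idea, the following minimal sketch computes an effective pattern density map with a Gaussian weighting function. Several details are assumptions made for the example and are not specified in the text: the filter length is expressed directly as a Gaussian standard deviation in units of grid cells, the kernel is truncated at three standard deviations, and the weights are renormalized at the chip edge instead of padding the layout.

```python
import math

def effective_density(density, length_cells):
    """Gaussian-weighted average of a 2-D local pattern density map.

    density:      list of rows of local pattern densities (0 to 1)
    length_cells: Gaussian standard deviation in grid cells (assumed
                  stand-in for the filter length)
    """
    rows, cols = len(density), len(density[0])
    radius = int(math.ceil(3 * length_cells))  # truncate kernel at 3 sigma
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            weight_sum = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        w = math.exp(-(di * di + dj * dj)
                                     / (2.0 * length_cells ** 2))
                        total += w * density[ii][jj]
                        weight_sum += w
            # renormalize so that cells near the chip edge stay unbiased
            out[i][j] = total / weight_sum
    return out

# A single dense cell in an empty neighborhood is smeared out by the filter.
local = [[0.0] * 7 for _ in range(7)]
local[3][3] = 1.0
print(round(effective_density(local, 1.0)[3][3], 3))
```

The direct double loop is only practical for small maps; for a full chip a convolution based on fast Fourier transforms would normally be preferred.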

### 3.4 ECD Model Framework

In this section, we first review the major relevant feature-scale copper electroplating models available in the literature. The new chip-scale copper ECD model is then presented in detail. Simulation results using the new ECD model are presented in Section 3.5.

#### 3.4.1 Feature-Scale Model Review

Because the traditional diffusion-adsorption mechanism cannot explain the “bottom-up” fill produced by multi-component additive recipes in advanced copper electroplating processes, new mechanisms have been developed to address this phenomenon and to guide process optimization and development under the ever more stringent requirements of the semiconductor industry. A.C. West and T.P. Moffat are leading researchers in this area. Their models explain the “bottom-up” effects and predict the trench filling process by considering the change in additive surface coverage due to surface accumulation, and the competitive adsorption and desorption among the additives (accelerator, suppressor and leveler). The basic ideas of these models were discussed briefly in the first chapter; here the details, especially the model formalism, of the two types of models are discussed further by comparing their similarities and differences. After the formalism of these models is reviewed, the basic framework for this research is introduced in the next section. The model used in this project borrows many concepts and ideas from these two types of feature-scale models and integrates them into a new framework for chip-scale modeling. West’s work is introduced first, followed by a summary of Moffat’s models.

The model proposed by West uses an accelerator-accumulation mechanism to explain bump formation in the two-component (accelerator + suppressor) system [12]. The accumulation of the accelerator, SPS, on the trench bottom causes a temporary decrease in the coverage of the suppressor, PEG, due to adsorption and desorption kinetics. The competition for available additive adsorption sites on the surface is the key to this model. The lower surface coverage of suppressor decreases the inhibition effect, and thus increases the copper deposition rate and causes bump formation in the submicron features. There are three major assumptions in the model [14]: the SPS coverage increases as the trench bottom surface area decreases during the deposition process; the increase in SPS coverage lowers the coverage of PEG; and the concentration variations of all additives and of cupric ion are ignored.

The kinetics of the SPS surface coverage are expressed by the following equation:

\[ \frac{d(A\theta_{SPS})}{dt} = -k_1 A(\theta_{SPS} - \theta_{SPS,eq}) \]  \hspace{1cm} (3-5)

where \( A \) is the local surface area, \( \theta_{SPS} \) and \( \theta_{SPS,eq} \) are the SPS surface coverages at time \( t \) and at equilibrium, and \( k_1 \) is a rate constant. The local current density, which determines the copper deposition rate, is

\[ i = i_{m}(1 - \theta_{PEG}) \]  \hspace{1cm} (3-6)

where \( i_{m} \) is the current density without PEG present. The surface coverage of SPS is thus assumed to have no direct impact on the copper deposition rate other than through changing the PEG coverage. Unlike the SPS kinetics, the kinetics of PEG do not depend directly on the change in surface area, and can be described by the following equation:

\[ \frac{d(\theta_{PEG})}{dt} = -k_2 (\theta_{PEG} - K(1 - \theta_{PEG})) \]  \hspace{1cm} (3-7)

where \( K \) is fitted as a function of \( \theta_{SPS} \):

\[ K = 30 \exp(-7\sqrt{\theta_{SPS}}) \]  \hspace{1cm} (3-8)

The initial surface coverages of SPS and PEG at \( t=0 \) are \( \theta_{SPS,eq} \) and \( \theta_{PEG,eq} \) respectively, and

\[ \theta_{PEG,eq} = \frac{K(\theta_{SPS,eq})}{1 + K(\theta_{SPS,eq})} \]  \hspace{1cm} (3-9)

The values of the fitted kinetic parameters, $k_1$, $k_2$ and $\theta_{PEG, eq}$, are 1 s$^{-1}$, 0.005 s$^{-1}$ and 0.05, respectively. The equilibrium current density without local surface change, $i_{m}(1 - \theta_{PEG, eq})$, is assumed to be 10 mA cm$^{-2}$. The values of $k_2$ and $\theta_{PEG, eq}$ are regarded as physically reasonable. With these fitted parameters, the model successfully predicts superfilling and bump formation. However, Moffat has commented on the kinetic parameter fitting; the parameter values will be reviewed in a later section.
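The coupling between the shrinking surface area and the two coverages can be illustrated with a small numerical sketch of Equations 3-5 through 3-9. Several points are assumptions made only for this example: the PEG rate equation is used without an explicit area factor, the area history \( A(t) \) is a hypothetical linear shrinkage, the value of \( \theta_{SPS,eq} \) passed in is an arbitrary illustrative input, and forward-Euler time stepping is used for simplicity. Only \( k_1 \), \( k_2 \) and the fitted form of \( K \) are taken from the text.

```python
import math

def k_ratio(theta_sps):
    """Equation 3-8: fitted dependence of K on the SPS coverage."""
    return 30.0 * math.exp(-7.0 * math.sqrt(theta_sps))

def peg_equilibrium(theta_sps_eq):
    """Equation 3-9: equilibrium PEG coverage."""
    k = k_ratio(theta_sps_eq)
    return k / (1.0 + k)

def simulate(theta_sps_eq, area, k1=1.0, k2=0.005, dt=1e-3, t_end=10.0):
    """Forward-Euler integration of the SPS and PEG coverages.

    area: callable t -> local surface area (hypothetical shape history)
    Returns (theta_sps, theta_peg, i_over_im) at t_end.
    """
    theta_sps = theta_sps_eq
    theta_peg = peg_equilibrium(theta_sps_eq)
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        a_now, a_next = area(t), area(t + dt)
        # Eq. 3-5 expanded with the product rule:
        # d(theta)/dt = -k1 (theta - theta_eq) - (theta / A) dA/dt
        d_sps = (-k1 * (theta_sps - theta_sps_eq)
                 - theta_sps * (a_next - a_now) / (a_now * dt))
        # PEG kinetics, written without an explicit area factor
        d_peg = -k2 * (theta_peg - k_ratio(theta_sps) * (1.0 - theta_peg))
        theta_sps = min(max(theta_sps + dt * d_sps, 0.0), 1.0)
        theta_peg = min(max(theta_peg + dt * d_peg, 0.0), 1.0)
        t += dt
    return theta_sps, theta_peg, 1.0 - theta_peg  # Eq. 3-6 gives i / i_m

# Hypothetical trench bottom whose area shrinks linearly to 20% in 10 s.
print(simulate(0.05, lambda t: 1.0 - 0.08 * t))
```

With a shrinking area the SPS coverage rises above its equilibrium value, the PEG coverage falls, and the normalized current density increases, which is the qualitative behavior the model relies on to produce bumps.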

Equations 3-7 and 3-9 can be modified by considering the sites available for PEG adsorption [54]:

$$\frac{d(\theta_{PEG})}{dt} = -k_2(\theta_{PEG} - K(1 - \theta_{SPS} - \theta_{PEG})) \quad (3-10)$$

$$\theta_{PEG, eq} = \frac{K(\theta_{SPS, eq})}{1 + K(\theta_{SPS, eq})} (1 - \theta_{SPS, eq}) \quad (3-11)$$

Moffat et al. [17] present a feature-scale copper electroplating model called the Curvature Enhanced Accelerator Coverage (CEAC) model. This model not only successfully addresses the issues of initial conformal film growth, bump formation over trenches, and additive surface coverage for void-free filling, but has also been extended to explain quantitatively the superconformal deposition in submicron features in Au and Ag electroplating and in Cu CVD processes [12, 18, 19, 55, 56, 57]. In the CEAC model, the film growth rate is determined by the local accelerator surface coverage, rather than by the decrease in suppressor inhibition as in West’s model. The catalyst, or accelerator, keeps “floating” at the interface between the copper film and the electrolyte, and is neither incorporated into the deposited film nor consumed during the process. During deposition, the local change in surface area enriches the accelerator coverage on concave surfaces such as the trench bottom, and dilutes the coverage on convex sites such as the trench top. The curvature enhanced accelerator coverage mechanism becomes strong for small features, especially submicron ones [19].

The local film growth rate [20] is experimentally derived and expressed in terms of the surface coverage of accelerator \( \theta \), overpotential \( \eta \), the cupric ion concentration \( C \) at the interface and the bulk cupric concentration \( C_{Cu} \) in the electrolyte:

\[
v(\theta, \eta, C) = \frac{C}{C_{Cu}} \nu_v(\theta) \exp\left(-\frac{\alpha(\theta)F}{RT} \eta\right)
\]  

(3-12)

The normal local film growth rate is proportional to the current density

\[
v = \frac{i\Omega}{2F}
\]

(3-13)

where \( \Omega \) is the molar volume of copper and \( F \) is Faraday’s constant, 96485 C/mol. The evolution of the fractional accelerator surface coverage \( \theta \) is determined by the following equation [58]:

\[
\frac{d\theta}{dt} = \nu_v \kappa \theta + k^+ (1 - \theta) C_a - k^- \theta^n
\]

(3-14)

where \( \nu_v \) is the normal growth rate, \( \kappa \) is the mean curvature of the interface, \( C_a \) is the accelerator concentration, and \( k^+ \) and \( k^- \) describe the adsorption and desorption/deactivation of the accelerator. The first term addresses the curvature enhanced coverage effect.

By assuming that \( \nu_v(\theta) \) and \( \alpha(\theta) \) are linear in \( \theta \), Equation 3-12 can be rewritten as

\[
v(\theta, \eta, C) = \frac{C}{C_{Cu}} (b_0 + b_1 \theta) \exp\left(-\frac{(m_0 + m_1 \theta)F}{RT} \eta\right)
\]

(3-15)

Further, the exponential term can be ignored in certain conditions [59], then

\[
v(\theta, C) = \frac{C}{C_{Cu}} R_0 (1 + k \theta)
\]

(3-16)

where \( R_0 \) is the standard copper growth rate with no accelerator coverage and \( k \) is a constant.
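Equations 3-14 and 3-16 can be combined into a simple time-stepping sketch. The code below is illustrative only: the sign convention (positive curvature \( \kappa \) for a concave surface, so that the first term of Equation 3-14 enriches the coverage), the forward-Euler step, the clipping of \( \theta \) to the interval from 0 to 1, and all numerical parameter values are assumptions made for the example and are not taken from the cited work.

```python
def growth_rate(theta, c_ratio, r0, k):
    """Equation 3-16: v = (C / C_Cu) * R0 * (1 + k * theta)."""
    return c_ratio * r0 * (1.0 + k * theta)

def coverage_step(theta, v, kappa, c_acc, k_plus, k_minus, n, dt):
    """One forward-Euler step of Equation 3-14, clipped to [0, 1]."""
    d_theta = (v * kappa * theta                   # curvature enhancement
               + k_plus * (1.0 - theta) * c_acc    # adsorption on free sites
               - k_minus * theta ** n)             # desorption / deactivation
    return min(max(theta + dt * d_theta, 0.0), 1.0)

# Hypothetical values: rates in nm/s, curvature in 1/nm, time in s.
theta_flat = theta_concave = 0.1
for _ in range(1000):
    v_flat = growth_rate(theta_flat, 1.0, 1.0, 10.0)
    v_conc = growth_rate(theta_concave, 1.0, 1.0, 10.0)
    theta_flat = coverage_step(theta_flat, v_flat, 0.0, 1.0, 0.01, 0.0, 1, 0.01)
    theta_concave = coverage_step(theta_concave, v_conc, 0.02, 1.0, 0.01, 0.0, 1, 0.01)
print(theta_flat, theta_concave)
```

In this sketch the concave surface accumulates accelerator faster than the flat surface, and through Equation 3-16 it also grows faster, which is the positive feedback responsible for superfilling.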
Based on the CEAC mechanism described above, several methods have been applied to simulate the superfilling deposition process at the feature scale. The front-tracking code, which does not consider the position dependence of the cupric ion and additive concentrations near the trench surface, requires about 30 minutes of computation time, and the level-set code, which introduces the diffusion equations of the cupric ion and additives, requires several hours. By capturing the basic ideas of the superfill mechanism and using simple first-order differential equations, the simple geometrical model reduces the computation time dramatically, to the order of seconds. Its simulation results are in good agreement with the results from the much more complex codes and with experimental data over a broad range of parameter space, with few problems in predictive ability [20]. This model can therefore be used to search a broad range of parameter space for process optimization. The simple model can also be applied not only to trenches but also to vias [19]. In fact, it can be extended to process an arbitrary layout with proper modification. Another advantage is that it is easy to extract quantities related to step height and array height for topography characterization from the simplified plating simulation model. This model framework is the prototype for the chip-scale model in this work.
Figure 3-24. A schematic of the approximate geometry for the simple model [20].

Straight vertical and horizontal lines approximate the time-dependent interface shape in the trench, as shown in Figure 3-24. The evolution of accelerator coverage on the top, sidewalls and bottom of the trench is expressed in terms of the concentration of the accelerator $C_{\text{accelerator}}$, the diffusion coefficient $D_{\text{accelerator}}$, the number of available sites $\Gamma(1-\theta)$, a potential dependent rate constant $k(\eta)$, the sidewall growth rate $\nu_s$ and the bottom growth rate $\nu_b$ in Equations 3-17, 3-18 and 3-19.

\[
\frac{d\theta_s}{dt} = \frac{C_{\text{accelerator}}k(1-\theta_s)}{1 + \delta \Gamma k(1-\theta_s) / D_{\text{accelerator}}} \tag{3-17}
\]

\[
\frac{d\theta_b}{dt} = \frac{C_{\text{accelerator}}k(1-\theta_b)}{1 + \delta \Gamma k(1-\theta_b) / D_{\text{accelerator}}} + \frac{2\theta_s \nu_s}{w} + \frac{2\theta_b \nu_b}{w} \tag{3-19}
\]

The horizontal displacements of the sidewalls and the vertical displacement of the bottom surface are expressed in Equation 3-20 and 3-21.

\[ x(t) = \int_{0}^{t} \nu_s[\theta_s(t), C_s(t)] dt = \int_{0}^{t} \nu_s(t) dt \tag{3-20} \]

\[ y(t) = \int_{0}^{t} \nu_b[\theta_b(t), C_b(t)] dt = \int_{0}^{t} \nu_b(t) dt \tag{3-21} \]

The cupric ion flux into the gap of the trench is expressed as
\[ w_{\text{gap}} \Omega_{\text{Cu}} D_{\text{Cu}} \nabla C \bigg|_{\text{gap}} = 2h_{\text{gap}} v_s + w_{\text{gap}} v_b \quad (3-22) \]

Assuming a linear relation between the concentrations \( C_b \) and \( C_i \) (\( C_b = \beta C_i \)) and a constant gradient
\[ \nabla C = \frac{(C_i - C_b)}{h_{\text{gap}}} \]
the last equation can be further simplified to
\[ w_{\text{gap}} \Omega_{\text{Cu}} D_{\text{Cu}} \frac{C_i(1-\beta)}{h_{\text{gap}}} = 2h_{\text{gap}} v_s + w_{\text{gap}} v_b \quad (3-23) \]

The velocity of the top surface can be expressed by the following equation, by considering the copper ion mass balance across the boundary layer:
\[ \Omega_{\text{Cu}} D_{\text{Cu}} \frac{(C_{\text{Cu}} - C_i)}{\delta} = v_t \quad (3-24) \]

After assuming \( C_s \approx C_i \) and \( v_s \approx v_t \), the cupric ion concentration for the sidewalls \( C_s \) and the cupric ion concentration for the bottoms \( C_b \) are written as Equation 3-25 and 3-26 respectively.

\[ C_s(t) = C_{\text{Cu}} - \frac{\delta v_t}{\Omega_{\text{Cu}} D_{\text{Cu}}} \quad (3-25) \]

\[ C_b(t) = \left( 1 - \frac{(h+x-y)}{(w-2x)} \, \frac{2(h+x-y) v_s + (w-2x) v_b}{(C_{\text{Cu}} \Omega_{\text{Cu}} D_{\text{Cu}} - \delta v_t)} \right) \times \left( C_{\text{Cu}} - \frac{\delta v_t}{\Omega_{\text{Cu}} D_{\text{Cu}}} \right) \quad (3-26) \]

The bump formation for submicron features in copper electroplating worsens the uniformity of the post-plating topography, which is undesirable for the subsequent CMP process. In the latest electroplating tools, accelerator-suppressor-leveler electrolytes are adopted to reduce bump formation. The leveler can deactivate or exclude the accelerator and change the copper deposition rate. Because of its mass-transfer dependence, the leveler plays its major role after bump formation rather than before the trench is filled. The attenuation effect of the leveler is shown in Figure 3-25. The competitive adsorption and desorption among the three additives are quantitatively formulated in the latest paper from Moffat’s group, in which the CEAC model is expanded into the so-called “Curvature Enhanced Adsorbate Coverage” model [60]. In this model, the growth equation is expressed as a function of the coverages of the three additives and the overpotential.

![Figure 3-25. Attenuation of bump formation during copper deposition [60].](image)

The effect of the leveler on copper growth rate is shown in Figure 3-26 [61]. It is clear that the leveler’s inhibition is only effective after the fill is complete.
Figure 3-26. Local copper growth rate in and over feature [61].

#### 3.4.2 Chip-Scale Model

The stack information for M1 is shown in Figure 3-27. The trench depth, $h_0$, is 3500 Å. As shown in the previous HRP scans, the ECD process being calibrated is not a latest-generation process free of significant bumps; therefore two additives, accelerator and suppressor, are considered in this chip-scale model. The model has the potential to be extended to the three-additive case, as was done for the CEAC model.

In the first part, the simplified feature-scale topography evolution is discussed case by case. In the second part, an inter-feature cupric depletion effect is considered to explain the edge effect, which is observed in the previous HRP scans.

##### 3.4.2.1 Basic Formalism

In order to simplify the film topography evolution, three cases are discussed separately. As shown in Figure 3-28, in Case 1 the trench is unfilled at both consecutive time steps $t$ and $(t+1)$. In Case 2, the trench is not filled at time step $t$ but is filled at time step $(t+1)$. In Case 3, which follows Case 2, the surface has no step and is assumed to advance as a flat, horizontal surface. Figure 3-29 shows Case 2 and Case 3. The transition at trench filling is simplified to further reduce the computational load, for several reasons. The real profile during this transition is too complex to be approximated by straight lines as in Case 1. In addition, the real wave-like surface after trench filling has no significant effect on the subsequent CMP process modeling because of its fine dimensions. Averaging the surface into a flat one provides enough accuracy for the CMP model, since the surface of the pad is rough on a scale larger than a micron. In Figure 3-30, the surface evolution of submicron features with the same aspect ratio but different pattern densities is simulated at the feature scale in full detail [62]; the computational cost of such a simulation is prohibitive for chip-scale modeling. Moreover, for fine features with high pattern density and thin field copper thickness (short electroplating time), the flat surface does not deviate significantly from the real profile. Because of the short electroplating times used in real processes in the semiconductor industry, the abrupt transition in Case 2 is acceptable in most cases, especially for the subsequent chip-scale CMP modeling.

Figure 3-28. Profile evolution before trench overfill at feature level.

Figure 3-29. Profile evolution after trench overfill at feature level.
Case 1

Because the accelerator and the suppressor can each produce superconformal deposition alone, both contribute to the growth equation for two-additive copper deposition, rather than only the accelerator or only the suppressor as in the Moffat or West models. Combining Equations 3-6, 3-13 and 3-16, the growth equation used in this project is

$$v(\theta, C) = \frac{C}{C_{Cu}} R_0 (1 + k \theta_{ACC} - \theta_{SUP}) \quad (3-27)$$

where $v$ is the copper deposition rate, $k$ is a constant, and $R_0$ is the deposition rate when the surface cupric ion concentration equals the bulk cupric ion concentration, $C = C_{Cu}$, and the fractional surface coverages of accelerator, $\theta_{ACC}$, and suppressor, $\theta_{SUP}$, are zero.

For a flat wafer without trenches at equilibrium, the growth rate is

$$v_{eq} = \frac{C_{eq}}{C_{Cu}} R_0 (1 + k \theta_{ACC,eq} - \theta_{SUP,eq}) \quad (3-28)$$

where $v_{eq}$ is the copper deposition rate for a flat wafer or field at equilibrium, and $C_{eq}$, $\theta_{ACC,eq}$ and $\theta_{SUP,eq}$ are the equilibrium values of the surface cupric ion concentration, $C$, and of the fractional surface coverages of accelerator, $\theta_{ACC}$, and suppressor, $\theta_{SUP}$. Dividing Equation 3-27 by Equation 3-28 eliminates $R_0$ and $C_{Cu}$, so Equation 3-27 can be rewritten as
\[ v(\theta, C) = \frac{C(1 + k\theta_{ACC} - \theta_{SUP})}{C_{eq}(1 + k\theta_{ACC,eq} - \theta_{SUP,eq})} v_{eq} \tag{3-29} \]

Based on Equation 3-13, if \( i_{eq} \) is assumed at 10mA/cm\(^2\), the corresponding \( v_{eq} \) is 3.68 nm/s. It is almost equivalent to use time-step or field thickness step in the modeling. The copper deposition rate at the top, side and bottom of the trench at time step \( t \) are expressed in the following three equations:
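The quoted value of $v_{eq}$ can be checked with a short calculation. The sketch below assumes that Equation 3-13 (not reproduced in this section) is the usual Faraday's-law relation between current density and growth rate, with two electrons per cupric ion and the molar volume of copper quoted later in this section; the function name is hypothetical.

```python
FARADAY = 96485.0  # Faraday constant, C/mol


def deposition_rate_nm_per_s(current_density_a_cm2,
                             molar_volume_cm3_mol=7.11,
                             electrons=2):
    """Growth rate from Faraday's law: v = i * Omega / (n * F).

    Assumption: this is the form of Equation 3-13, which is not
    shown in this section.
    """
    rate_cm_s = current_density_a_cm2 * molar_volume_cm3_mol / (electrons * FARADAY)
    return rate_cm_s * 1e7  # convert cm/s to nm/s


v_eq = deposition_rate_nm_per_s(10e-3)  # 10 mA/cm^2
print(f"v_eq = {v_eq:.2f} nm/s")  # v_eq = 3.68 nm/s
```

Under these assumptions the calculation reproduces the 3.68 nm/s used throughout the model.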

\[ v_t^t = \frac{C_t^t\left(1 + k\theta_{ACC,t}^t - \theta_{SUP,t}^t\right)}{C_{eq}\left(1 + k\theta_{ACC,eq} - \theta_{SUP,eq}\right)} v_{eq} \tag{3-30} \]

\[ v_s^t = \frac{C_s^t\left(1 + k\theta_{ACC,s}^t - \theta_{SUP,s}^t\right)}{C_{eq}\left(1 + k\theta_{ACC,eq} - \theta_{SUP,eq}\right)} v_{eq} \tag{3-31} \]

\[ v_b^t = \frac{C_b^t\left(1 + k\theta_{ACC,b}^t - \theta_{SUP,b}^t\right)}{C_{eq}\left(1 + k\theta_{ACC,eq} - \theta_{SUP,eq}\right)} v_{eq} \tag{3-32} \]

\[ T_u^{t+1} = T_u^t + \Delta t\, v_t^t \tag{3-33} \]

\[ T_l^{t+1} = T_l^t + \Delta t\, v_b^t \tag{3-34} \]

\[ S^t = T_u^t - T_l^t \tag{3-35} \]

\[ S^{t+1} = T_u^{t+1} - T_l^{t+1} \tag{3-36} \]

\[ B^{t+1} = B^t + \Delta t\, v_b^t \tag{3-37} \]

where the subscripts \( t \), \( s \) and \( b \) denote the top, side and bottom of the trench and the superscript denotes the time step, \( \Delta t \) is the time step, \( T_u \) and \( T_l \) are the surface positions above the top and bottom of the trench, respectively, measured from the top of the barrier layer, \( B \) is the copper film bias on the side wall, and \( S \) is the step height, not the effective step height.
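The explicit update of the surface positions in Case 1 can be sketched in a few lines. This is a minimal illustration of Equations 3-33 to 3-37 as printed, not the thesis implementation; the function name, the argument names and the numerical values in the example are hypothetical.

```python
def case1_step(T_u, T_l, B, v_top, v_bottom, dt):
    """One explicit time step for an unfilled trench (Case 1).

    T_u, T_l : surface positions above the trench top and bottom,
               measured from the top of the barrier layer
    B        : copper film bias on the side wall
    """
    S = T_u - T_l                    # step height at time t (Eq. 3-35)
    T_u_next = T_u + dt * v_top      # Eq. 3-33
    T_l_next = T_l + dt * v_bottom   # Eq. 3-34
    S_next = T_u_next - T_l_next     # Eq. 3-36
    B_next = B + dt * v_bottom       # Eq. 3-37, rate taken as printed
    return T_u_next, T_l_next, B_next, S, S_next


# Hypothetical example in nm and nm/s: a 350 nm deep trench whose
# bottom grows faster than the field.
state = case1_step(T_u=0.0, T_l=-350.0, B=0.0,
                   v_top=3.68, v_bottom=10.0, dt=1.0)
print(state)
```

Because the bottom rate exceeds the top rate in this example, the step height shrinks from one time step to the next, which is the behavior expected during superconformal filling.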

\[ S_{eff}^t = \frac{S^t D^t}{D^0} \tag{3-38} \]

where \( D^0 \) and \( D^t \) are the pattern densities of the trench bottom area in each grid cell at time \( 0 \) and \( t \). In fact, \( D^0 \) equals the pattern density obtained directly from the layout extraction.
As shown in Figure 3-27, the pattern density, $D^t$, and total pattern perimeter, $P^t$, for each cell can be expressed as

$$D^t = D^0 - \frac{B^t P^0 - 4N(B^t)^2}{l^2} \quad (3-39)$$

$$P^t = P^0 - 8NB^t \quad (3-40)$$

where $N$ is the number of objects in each grid cell and $l$ is the grid cell size used in layout extraction.
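These two relations follow from shrinking each of $N$ rectangular objects by the side-wall bias on every edge: a rectangle of area $wh$ becomes $(w-2B)(h-2B)$, which loses $B$ times its perimeter and regains $4B^2$ at the corners. A minimal sketch, with hypothetical function and argument names, checks the formulas against a single square trench:

```python
def shrink_pattern(D0, P0, N, B, l):
    """Pattern density and perimeter after a side-wall bias B.

    D0, P0 : extracted pattern density and total perimeter of the cell
    N      : number of (rectangular) objects in the cell
    l      : grid cell size
    """
    D = D0 - (B * P0 - 4 * N * B ** 2) / l ** 2   # Eq. 3-39
    P = P0 - 8 * N * B                            # Eq. 3-40
    return D, P


# One 10 x 10 trench in a 20 x 20 cell, with a bias of 1 on each wall:
# the open area becomes 8 x 8, so D = 64/400 and P = 32.
D, P = shrink_pattern(D0=100 / 400, P0=40.0, N=1, B=1.0, l=20.0)
print(D, P)
```

The direct geometric calculation and the two formulas agree for this case.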

In this part, the surface cupric ion concentrations at the top, side and bottom of the features are formulated based on the simplified version of Moffat's model.

$$C_t^{t+1} = C_{eq} R^t \quad (3-41)$$

$$C_s^{t+1} \approx C_t^{t+1} = C_{eq} R^t \quad (3-42)$$

$$C_b^{t+1} = \beta^t C_t^{t+1} = \beta^t C_{eq} R^t \quad (3-43)$$

$$C_t^0 = C_b^0 = C_s^0 = C_{eq} \quad (3-44)$$

$$R^0 = \beta^0 = 1 \quad (3-45)$$

where $R^t$ and $\beta^t$ are the inter-feature and inner-feature cupric ion depletion factors. $R^t$ is discussed in the next section. Equation 3-23 for $\beta^t$ can be rewritten for the grid cell as follows.

$$D' \Omega_{cu} D_{cu} \frac{\left(1 - \beta'\right)}{S'} = 2 P'S' \frac{l^2}{l^2} v'_s + D'_v v'_h$$  \hspace{1cm} (3-46)$$

$$\beta' = 1 - \frac{S'}{C' \Omega_{cu} D_{cu}} \frac{l^2}{l^2} v'_s + D'_v v'_h$$  \hspace{1cm} (3-47)$$

$$\beta' = 1 - \frac{S'}{C_{eq} \left(1 + k \theta_{4CC,eq} - \theta_{SUP,eq}\right)} \frac{l^2}{l^2} v'_s + D'_v v'_h$$  \hspace{1cm} (3-48)$$
Let $L_{eq} = \frac{C_{Cu} \Omega_{Cu} D_{Cu}}{v_{eq}}$.

Using typical values for the following variables:

$v_{eq} = 3.68\ \text{nm/s}$, $\Omega_{Cu} = 7.11\ \text{cm}^3/\text{mol}$,

$C_{Cu} = 2.4\times10^{-4}\ \text{mol/cm}^3$ [56], $D_{Cu} = 4\times10^{-6}\ \text{cm}^2/\text{s}$ [56]

$L_{eq} = 185.5\ \mu\text{m}$
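The value of $L_{eq}$ can be reproduced from the quoted constants. The sketch below assumes the characteristic length is the product of bulk concentration, molar volume and diffusivity divided by the equilibrium growth rate, which is the only combination of these four quantities with units of length, and it takes the diffusivity in cm²/s.

```python
v_eq = 3.68e-7        # equilibrium growth rate, cm/s (3.68 nm/s)
omega_cu = 7.11       # molar volume of copper, cm^3/mol
c_cu = 2.4e-4         # bulk cupric ion concentration, mol/cm^3
d_cu = 4e-6           # cupric ion diffusivity, cm^2/s

L_eq_cm = c_cu * omega_cu * d_cu / v_eq
L_eq_um = L_eq_cm * 1e4   # convert cm to micrometers

print(f"L_eq = {L_eq_um:.1f} um")  # L_eq = 185.5 um
```

Since the trench depth is 0.35 µm, the step height is roughly three orders of magnitude smaller than $L_{eq}$.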

\[ \beta' = 1 - k'_3 \frac{S'}{L_{eq}} \left( \frac{2 S' S'' + D' v_b'}{D' v'_b} \right) \]

where \( k'_3 = \frac{C_{Cu}}{C_{eq}} \left( 1 + k_{ACC,eq} - \theta_{SUP,eq} \right) \)

The bracketed ratio in the expression above is the ratio of the total copper deposited in the trench, including the sides and bottom, to the copper deposited on the bottom area if the step height were zero. Its value increases with aspect ratio and pattern density and should be less than 5 in most cases. The prefactor $k'_3$ should also be less than 10. Since the step height is much smaller than $L_{eq}$, the inner-feature depletion factor is more than 90% in most cases. In fact, a cupric ion concentration depletion of 3% has been reported [62]. The inner-feature cupric ion depletion effect can therefore be ignored in the modeling.

The surface coverage of accelerator and that of suppressor are calculated mostly based on West’s model.

\[ \theta_{ACC,t}^0 = \theta_{ACC,b}^0 = \theta_{ACC,eq} \]

\[ \theta_{SUP,t}^0 = \theta_{SUP,b}^0 = \theta_{SUP,eq} \]

where the first and second terms describe the top surface change and the accelerator adsorption/desorption kinetics, respectively. Figure 3-27 (a) shows the surface change approximation for the top and the bottom.

\[
\frac{(1 - D')}{(1 - D'^*) + \frac{P'^*}{l^2}} < 1
\]

\[
K'_t = 30 \exp \left( -7 \sqrt{\theta'_{ACC,t}} \right)
\]  \hspace{1cm} (3-53)

\[
\theta'_{SUP,t}^{*+1} = \theta'_{SUP,t} + k_1 \left( \theta'_{SUP,t} - \theta'_{SUP,t} - \theta'_{ACC,t} \right) \Delta t
\]  \hspace{1cm} (3-54)

\[
\theta'_{ACC,t}^{*+1} = \theta'_{ACC,t} + k_2 \left( \theta'_{ACC,t} - \theta'_{ACC,t} - \theta'_{ACC,eq} \right) \Delta t
\]  \hspace{1cm} (3-55)

\[
K'_t = 30 \exp \left( -7 \sqrt{\theta'_{ACC,t}} \right)
\]  \hspace{1cm} (3-56)

\[
\theta'_{SUP,b}^{*+1} = \theta'_{SUP,b} + k_2 \left( \theta'_{SUP,b} - \theta'_{SUP,b} - \theta'_{ACC,b} \right) \Delta t
\]  \hspace{1cm} (3-57)

\[
\theta'_{ACC,b}^{*+1} = \theta'_{ACC,b} + \frac{D' \theta'_{ACC,b} + \theta'_{ACC,b} P'B}{D'^* l^2} - k_2 \left( \theta'_{ACC,b} - \theta'_{ACC,b} - \theta'_{ACC,eq} \right) \Delta t
\]  \hspace{1cm} (3-58)

\[
K'_b = 30 \exp \left( -7 \sqrt{\theta'_{ACC,b}} \right)
\]  \hspace{1cm} (3-59)

\[
\theta'_{SUP,b}^{*+1} = \theta'_{SUP,b} + k_2 \left( \theta'_{SUP,b} - K (1 - \theta'_{SUP,b} - \theta'_{ACC,b}) \right) \Delta t
\]  \hspace{1cm} (3-60)

Two copper growth rates are defined here. The nominal copper growth rate, \( R_N \), is the volume of copper deposited in one grid cell per second divided by the area of the grid cell, whereas the average copper growth rate, \( R_A \), is the same volume divided by the actual surface area in the grid cell. \( R_N \) describes the total copper deposition rate through the corresponding boundary layer, and \( R_A \) is useful for showing the accelerator accumulation process during copper electroplating.

\[ R_N^t = D^t v_b^t + \left( 1 - D^t \right) v_t^t + \frac{S^t P^t}{l^2} v_t^t \tag{3-61} \]

\[ R_A^t = R_N^t \frac{l^2}{l^2 + S^t P^t} \tag{3-62} \]
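The two growth rates can be computed per grid cell as sketched below. The code follows Equations 3-61 and 3-62 as printed, including the use of the top rate in the side-wall term; the function and argument names are hypothetical and the numbers in the example are illustrative only.

```python
def growth_rates(D, P, S, l, v_bottom, v_top):
    """Nominal (R_N) and average (R_A) copper growth rates of one cell.

    D : trench-bottom pattern density      P : total pattern perimeter
    S : step height                        l : grid cell size
    """
    side_fraction = S * P / l ** 2           # side-wall area per unit cell area
    R_N = D * v_bottom + (1 - D) * v_top + side_fraction * v_top   # Eq. 3-61
    R_A = R_N * l ** 2 / (l ** 2 + S * P)                          # Eq. 3-62
    return R_N, R_A


# Illustrative cell: 50% density, side walls adding 10% extra surface.
R_N, R_A = growth_rates(D=0.5, P=40.0, S=1.0, l=20.0, v_bottom=2.0, v_top=1.0)
print(R_N, R_A)
```

For a flat cell ($S = 0$) with equal rates everywhere, both quantities reduce to the local deposition rate, which is a convenient sanity check.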
Case 2

If \( B^t < W_i / 2 \) and \( S^t > 0 \) but \( B^{t+1} \geq W_i / 2 \) or \( S^{t+1} < 0 \), the feature is filled, where \( W_i \) is the average feature width in each grid cell. To approximate the abrupt transition, the surface position and additive coverages are described by the following equations:

\[ v_t^t = R_N^t - \frac{S^t D^t}{\Delta t} \]

\[ T_u^{t+1} = T_l^{t+1} = T_u^t + \Delta t\, v_t^t \]

\[ B^{t+1} = W_i / 2, \quad S^{t+1} = 0, \quad D^{t+1} = 0 \text{ and } P^{t+1} = 0 \]

\[ \theta_{ACC,t}^{t+1} = \theta_{ACC,t}^{t}(1 - D^t) + \theta_{ACC,b}^{t} D^t + \theta_{ACC,s}^{t} \frac{S^t P^t}{l^2} \]

\[ K_t^t = 30\exp\left(-7\sqrt{\theta_{ACC,t}^{t}}\right) \]

\[ \theta_{SUP,t}^{t+1} = \theta_{SUP,t}^{t} - k_2\left(\theta_{SUP,t}^{t} - K\left(1 - \theta_{SUP,t}^{t} - \theta_{ACC,t}^{t}\right)\right)\Delta t \]

Case 3

After Case 2, the surface remains flat as it advances, as shown in Figure 3-29.

\[ v_t^t = \frac{C_t^t\left(1 + k\theta_{ACC,t}^{t} - \theta_{SUP,t}^{t}\right)}{C_{eq}\left(1 + k\theta_{ACC,eq} - \theta_{SUP,eq}\right)} v_{eq} \]

\[ \theta_{ACC,t}^{t+1} = \theta_{ACC,t}^{t} - k_1\left(\theta_{ACC,t}^{t} - \theta_{ACC,eq}\right)\Delta t \]

\[ K_t^t = 30\exp\left(-7\sqrt{\theta_{ACC,t}^{t}}\right) \]

\[ \theta_{SUP,t}^{t+1} = \theta_{SUP,t}^{t} - k_2\left(\theta_{SUP,t}^{t} - K\left(1 - \theta_{SUP,t}^{t} - \theta_{ACC,t}^{t}\right)\right)\Delta t \]
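The Case 3 coverage updates are first-order relaxations and can be illustrated with an explicit time-stepping loop. The sketch below reads the accelerator update as a relaxation toward $\theta_{ACC,eq}$ at rate $k_1$ and the suppressor update as a relaxation toward $K(1 - \theta_{SUP} - \theta_{ACC})$ at rate $k_2$. It treats $K$ as a constant and uses the fitted parameter values reported in Section 3.5.1 as defaults; both choices, the function name and the starting coverages are assumptions made for illustration.

```python
def case3_step(theta_acc, theta_sup, dt,
               k1=0.035, k2=0.005, K=15.0, theta_acc_eq=0.05):
    """One explicit step of the flat-surface (Case 3) coverage kinetics."""
    acc_next = theta_acc - k1 * (theta_acc - theta_acc_eq) * dt
    sup_next = theta_sup - k2 * (theta_sup - K * (1.0 - theta_sup - theta_acc)) * dt
    return acc_next, sup_next


# Start from a hypothetical accelerator-rich surface just after trench
# filling and relax for 2000 one-second steps.
acc, sup = 0.5, 0.4
for _ in range(2000):
    acc, sup = case3_step(acc, sup, dt=1.0)

# Steady state: acc -> theta_acc_eq, sup -> K * (1 - acc) / (1 + K)
print(acc, sup)
```

With these parameters the accelerator coverage decays back to its equilibrium value and the suppressor coverage recovers as the accelerator desorbs, which is the qualitative behavior the model relies on after trench filling.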

3.4.2.2 Inter-Feature Cupric Ion Depletion

To describe the edge effect of the fine arrays observed in the previous HRP scans, the inter-feature cupric ion depletion factor, \( R^t \), is introduced to account for the deviation of the surface cupric ion concentration at the top, \( C_t^t \), from the equilibrium value \( C_{eq} \). Here, the deviation is assumed to come from cupric ion depletion at the inter-feature level. That is, the average of the nominal copper growth rate \( R_N \) over the nearby grid cells determines the effective surface cupric ion concentration at the top. One explanation is that the bulk cupric ion concentration is not homogeneous because of mass transfer processes in the electroplating reactor, such as convection.

\[ \Omega_{Cu} D_{Cu} \left( \frac{C_{Cu} - C_t^t}{\delta} \right) = v_t^t \tag{3-72} \]

\[ v_t^t = \frac{C_t^t \left( 1 + k \theta_{ACC,t}^t - \theta_{SUP,t}^t \right)}{C_{eq} \left( 1 + k \theta_{ACC,eq} - \theta_{SUP,eq} \right)} v_{eq} \tag{3-73} \]

\[ C_{Cu} = C_t^t \left( 1 + \frac{\delta\, v_{eq} \left( 1 + k \theta_{ACC,t}^t - \theta_{SUP,t}^t \right)}{\Omega_{Cu} D_{Cu} C_{eq} \left( 1 + k \theta_{ACC,eq} - \theta_{SUP,eq} \right)} \right) \tag{3-74} \]

where \( \frac{\left( 1 + k \theta_{ACC,t}^t - \theta_{SUP,t}^t \right)}{\left( 1 + k \theta_{ACC,eq} - \theta_{SUP,eq} \right)} \approx 1 \) and thus \( C_{Cu} \propto C_t^t \).

The inhomogeneous bulk cupric ion concentration can introduce an additional distribution onto the surface cupric ion concentration on the top.

Besides variation in the bulk cupric ion concentration, the electric field or current distribution can be another cause. Cupric ion depletion in areas with a significantly high nominal copper growth rate $R_N$ changes the resistance of the electrolyte. The nominal bulk cupric ion concentration variation can still be used to approximate the overall effect in electroplating.

It is natural to expect that the inter-feature depletion effect in a grid cell depends on the nominal copper deposition rates of the nearby grid cells rather than of that single cell alone. The un-normalized and normalized inter-feature cupric ion depletion factors are defined in Equations 3-75 and 3-76.

The grid cell with the highest un-normalized value is usually in a wide field region, where the surface cupric ion concentration is close to the equilibrium value. The weight function used in Equation 3-75 is the “error” function filter, which was selected after screening the popular weight functions. $L_{ECD}$ is the characteristic length of the filter and captures the range of nearby grid cells that affect the cupric ion concentration in a specific grid cell. $\alpha$ is called the inter-feature cupric ion depletion index.

$$M'_{R^c} = \left[ F_{\text{filter}}(M_{R^c}, L_{ECD}) \right]^{\alpha}$$  \hspace{1cm} (3-75)

$$M'_{R^c} = M'_{R^c} / \text{Max}(M'_{R^c})$$  \hspace{1cm} (3-76)
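The filter, power and normalization steps of Equations 3-75 and 3-76 can be sketched on a one-dimensional row of grid cells. The exact form of the “error” function filter and of the input map is not recoverable from this section, so the sketch uses a Gaussian-shaped weight as a stand-in and accepts any positive per-cell input map; both are assumptions, as are the function name and the example values.

```python
import math


def depletion_factor(m, cell_size, L_ecd, alpha):
    """Filter a per-cell map, raise it to alpha, normalize by the maximum.

    m         : positive per-cell input values (1-D row of grid cells)
    cell_size : grid cell size, same length unit as L_ecd
    L_ecd     : characteristic length of the filter
    alpha     : inter-feature cupric ion depletion index
    """
    n = len(m)
    filtered = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(n):
            # Assumed stand-in weight; the thesis filter may differ.
            w = math.exp(-(((i - j) * cell_size) / L_ecd) ** 2)
            weight_sum += w
            value_sum += w * m[j]
        filtered.append(value_sum / weight_sum)
    powered = [f ** alpha for f in filtered]    # Eq. 3-75
    peak = max(powered)
    return [p / peak for p in powered]          # Eq. 3-76


# Hypothetical row of five 20 um cells with the fitted L_ECD and alpha.
r = depletion_factor([1.0, 1.0, 4.0, 1.0, 1.0], 20.0, 1200.0, 0.85)
print(r)
```

By construction the largest normalized value is exactly one, and a uniform input map gives a factor of one everywhere, so the factor only redistributes the concentration where the neighborhood is non-uniform.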

The following three figures compare the nominal copper deposition rate maps and the inter-feature cupric ion depletion factor maps at various times. The correlation between the two kinds of maps is clear.
Figure 3-31. Inter-feature cupric ion depletion effect at 10 sec.

Figure 3-32. Inter-feature cupric ion depletion effect at 60 sec.

Figure 3-33. Inter-feature cupric ion depletion effect at 140 sec.
3.5 Model Calibration

The model parameters are fitted using the array height and effective step height from four wafers: CPT 104-01, CPT 105-01, CPT 115-04 and CPT 115-07. Because the HRP cannot measure the deep trenches of fine features (less than 2 micron), no data from CPT 115-01 are used in the model parameter calibration.

The first part of this section discusses the fitted model parameters and compares the extracted data with the simulation results. The HRP scan simulations are then compared to the HRP scans. The time evolution of the chip-scale simulation is presented in the last part.

3.5.1 Calibration Results

The fitted model parameters:

\[ K = 15, \quad k_1 = 0.035 \text{ s}^{-1}, \quad k_2 = 0.005 \text{ s}^{-1}, \quad \theta_{\text{ACC,eq}} = 0.05 \]

\[ L_{ECD} = 1200 \text{ \mu m}, \quad \alpha = 0.85 \]

In Josell’s paper describing the simplified version of the CEAC model, the equivalent \( K \) value is 9.3 [20]. Although the fitted values for \( k_2 \) and \( \theta_{\text{ACC,eq}} \) in West’s paper [14] are the same as the values used in this research, the fitted kinetics parameter for the accelerator, \( k_1 \), in his paper is 1 s\(^{-1}\). The difference might come from process variation and the different model formalism, especially the growth rate equation: in this research, the contributions of the accelerator and the suppressor are considered at the same time. The model is very sensitive to the value of \( k_1 \). In Park’s paper [63], the analogous electroplating characteristic length, used to capture the field thickness variation, is 4680 \( \mu \text{m} \). The smaller \( L_{ECD} \) here is focused on fitting the edge effect rather than the field thickness variation; a larger value of \( L_{ECD} \) can reduce the fitting errors in the field thickness variations. For this specific electroplating process, the value of the inter-feature cupric ion depletion index, $\alpha$, is relatively large. The data used in Park’s thesis show much lower values, down to 0.2 to 0.4. This suggests that the process used in this research is not fully optimized. The edge effect might cause defects in the electroplating and the subsequent CMP process, although bump formation is attenuated by this kind of depletion effect.

The following table lists the RMS fitting error for array height, effective step height and field thickness. Generally, the fitting errors of the array heights and effective step heights increase with the electroplating time. However, this model can capture the variations of the array heights and effective step height of CPT 104-01, which is the baseline process.

Table 3-1. ECD chip-scale modeling error for MIT/SEMATECH 854 M1.

<table>
<thead>
<tr>
<th>Wafer</th>
<th>Array Height (Å)</th>
<th>Effective Step Height (Å)</th>
<th>Field Copper Thickness (Å)</th>
</tr>
</thead>
<tbody>
<tr>
<td>CPT104-01</td>
<td>157</td>
<td>148</td>
<td>83</td>
</tr>
<tr>
<td>CPT115-04</td>
<td>252</td>
<td>285</td>
<td>86</td>
</tr>
<tr>
<td>CPT105-01</td>
<td>243</td>
<td>291</td>
<td>98</td>
</tr>
<tr>
<td>CPT115-07</td>
<td>250</td>
<td>378</td>
<td>89</td>
</tr>
</tbody>
</table>
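For reference, an RMS fitting error of the kind reported in Table 3-1 can be computed from paired measured and simulated values as sketched below. The function name and the sample numbers are hypothetical and are not taken from the wafer data.

```python
import math


def rms_error(measured, simulated):
    """Root-mean-square difference between paired data and simulation."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical array heights in angstroms at four sites.
data = [5200.0, 4800.0, 6100.0, 5500.0]
model = [5100.0, 4950.0, 6000.0, 5650.0]
print(f"RMS error = {rms_error(data, model):.0f} A")
```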

We next consider the sensitivity of the fitting parameters. $L_{ECD}$ and $\alpha$ are responsible for the edge and corner effects from the inter-feature cupric ion depletion. A 50% increase or decrease in these two parameters results in only an approximately 5% change in the overall RMS error (for the range of wafers considered in Table 3-1). However, a change in these two parameters can dramatically change the profile at the edge of the sub-micron arrays. A 50% change in $K$, $k_2$ and $\theta_{ACC,eq}$ can result in a 5 to 10% increase in the overall RMS error, whereas a change of only 10% in $k_1$ produces the same 5 to 10% increase. The sensitivity analysis further confirms the dominating role of accelerators in copper electroplating.

The following eight figures compare the data values and simulation results of the array heights and effective step heights from the four test wafers. The sources of the fitting errors are discussed after reviewing these figures.
Figure 3-34. Array height data and simulation results for CPT104-01

Figure 3-35. Array height data and simulation results for CPT115-04.

Figure 3-36. Array height data and simulation results for CPT105-01.
Figure 3-37. Array height data and simulation results for CPT115-07.

Figure 3-38. Effective step height data and simulation results for CPT104-01.

Figure 3-39. Effective step height data and simulation results for CPT115-04.
Considering that the data extraction error is about 100 Å, the small fitting errors of the array height and effective step height for CPT 104-01 confirm the effectiveness of the straight-line profile approximation, as shown in Figures 3-34 and 3-38. The increasing errors for the subsequent ECD test wafers show that this approximation has its limits. The approximation follows the real profile evolution well for fine features with relatively short electroplating time or thin film thickness. For wide features, especially with high pattern density, the deviation of the approximation from the real profile grows with electroplating time, as shown in Figure 3-42.
While the bias close to the top of the trench is over-estimated, the bias close to the bottom is under-estimated. This deviation brings large errors for the features over 1 µm. This tendency can be seen clearly in CPT 115-07, which has almost twice the electroplated copper thickness as does CPT 104-01. The straight line approximation would predict that the trenches are filled for the features from 1 µm to 3 µm and have no step heights, which is not consistent with the data. The significant errors in the effective step height for the features larger than 5 µm are due to the under-estimation of the film growth at the trench bottom. Fortunately, the modern copper electroplating process used in the semiconductor industry prefers the thinner copper thickness due to the throughput consideration, and this model can provide relatively accurate chip-scale simulation for the subsequent CMP process with reasonable computation time.

Figure 3-42. Copper electroplating profile for the wide trench and narrow line space.
Figure 3-43. Field copper thickness data and simulation results for CPT104-01.

Figure 3-44. Field copper thickness data and simulation results for CPT115-04.

Figure 3-45. Field copper thickness data and simulation results for CPT105-01.
Figures 3-43 through 3-46 compare the data values and simulation results of the field copper thickness from the four test wafers. The overall trend of the field copper thickness variation in the four test wafers is captured by the model, although the full amplitude of the variation is not. This deviation might come from three sources. First, the field copper thickness close to the fine features is sensitive to the test location because of the edge effect, and the location error of the metrology tool cannot be completely avoided. Second, the thickness accuracy of the tool is comparable to the range of the variation, which makes it difficult to capture the true field copper thickness variations. Third, as mentioned before, a large value of $L_{ECD}$ can significantly decrease the fitting error for the field thickness variation but loses the ability to capture the edge effect, and vice versa. This implies either that the filter used in calibrating the inter-feature cupric ion depletion effects is not fully optimized or that the edge effect and the field copper thickness variation have different physical causes. The captured variation trend makes the first explanation more plausible. A more complex weighting function with two or more characteristic lengths might be a better choice. However, the prohibitive computational cost of optimizing the model parameters, the over-fitting concern, the small size of these errors compared to other fitting and measurement errors, and the rapid progress of electroplating technology make this improvement less valuable to pursue at this time.

Figure 3-47 compares the nominal copper deposition rates of the fine arrays. The abrupt changes in the deposition rate indicate the sharp transition from unfilled to filled trenches. The trench filling times are 26 sec for 0.18 μm, 38 sec for 0.25 μm, 68 sec for 0.5 μm and 120 sec for 1 μm features. Before trench filling, the nominal deposition rate is determined by the real surface area in each grid cell and the inter-feature cupric ion depletion effect. The sudden drop in the deposition rate is due to the dramatic shrinkage of the surface area and the simplification in Case 2. The response of the suppressor to the dramatic accumulation of the accelerator makes the deposition rate rebound, but it cannot reach the rate before trench filling. The accelerator accumulation just after trench filling increases the decay speed of the deposition rate. In the real process, this transition is smooth and free of this artifact of the modeling simplification.
Figure 3-47. Nominal copper deposition rate vs. electroplating time for fine features.

Figure 3-48 compares the average copper deposition rates of the fine arrays. The different way of accounting for the surface area changes the shape of the curves. If the sharp peaks that are artifacts of the model simplification are smoothed out, the curves are close to those in Figure 3-26, which come from real experimental data [61]. This observation further confirms the effectiveness of this chip-scale model. The simplification in the chip-scale model still captures the key phenomena in the real process and in the feature-scale models.

To extend this chip-scale model to the accelerator-suppressor-leveler three-additive system, the depletion of the leveler and the competitive adsorption and desorption among the three additives have to be considered. Unlike the accelerator and suppressor, the leveler is highly mass-transfer dependent. A simple way to modify this model is to make the kinetics rate of the accelerator, $k_1$, a function of the surface coverages of the accelerator and the leveler. A high surface coverage of the accelerator and the leveler at the same time will significantly increase the desorption speed, as shown in Figure 3-26.
3.5.2 *HRP Scan Simulation*

The simulation results with 20-μm resolution will be compared to the HRP scans with 0.2-μm resolution in the following figures. The simulation results capture most of the height variations in the leveled and adjusted HRP scans, especially the edge effects in the fine arrays. The successful simulation further justifies this chip-scale ECD model.

The HRP limits in the deep trenches of the fine features are clearly shown in Figure 3-49. Although the HRP scans are not used in the model calibration, the profiles of the features over 2 μm and of the filled finest features are captured accurately.
Figure 3-48. Average copper deposition rate vs. electroplating time for fine features.
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 3-49. Array topography simulation for CPT 115-01 (X: µm, Y: Å).
Figure 3-50. Array topography simulation for CPT 104-01 (X: μm, Y: Å).
Figure 3-51. Topography simulation of several arrays for CPT 104-01 (X: μm, Y: Å).
Figure 3-53. Array topography simulation for CPT 115-04 (X: μm, Y: Å).
Figure 3-54. Array topography simulation for CPT 105-01 (X: μm, Y: Å).
Figure 3-55. Array topography simulation for CPT 115-07 (X: μm, Y: Å).
Not only are the HRP array scans accurately replicated by the chip-scale ECD model, but the model also consistently simulates the long scans that are not used in the data extraction, as seen in Figures 3-51 and 3-52. This kind of internal verification is a good supplement to the external verification in the next section.

3.5.3 Chip-Scale Simulation

Figure 3-56 shows the pattern density and line width maps for the MIT/SEMATECH 854 M1 mask. The evolution of the envelope and effective step height maps is shown in the sequence of Figures 3-57 to 3-64. The chip-scale simulations capture the edge effect and the pattern dependency accurately and efficiently.
Figure 3-56. MIT/SEMATECH 854 M1 mask.

(a) Pattern density map (%)  
(b) Line width map (μm)

Figure 3-57. Chip-scale ECD modeling at t=10 sec.

(a) Envelope map (Å)  
(b) Effective step height map (Å)

Figure 3-58. Chip-scale ECD modeling at t=20 sec.
Figure 3-59. Chip-scale ECD modeling at t=30 sec.

Figure 3-60. Chip-scale ECD modeling at t=40 sec.

Figure 3-61. Chip-scale ECD modeling at t=60 sec.
Figure 3-62. Chip-scale ECD modeling at t=100 sec.

Figure 3-63. Chip-scale ECD modeling at t=200 sec.

Figure 3-64. Chip-scale ECD modeling at t=350 sec.
3.6 Model Verification

A separate MagnaChip internal mask is used to verify the validity of the new model framework. The verification mask has a wider line width and density design space than the MIT/SEMATECH 854 M1 mask, as well as a higher spatial frequency in pattern variation, with some wide structures. This strict verification case, close to real production layouts, provides a good platform to evaluate the new model. Figure 3-65 shows the pattern density and line width maps for the layout.

(a) Pattern density map (%)  
(b) Line width map (µm)

Figure 3-65. MagnaChip internal verification mask.

Figure 3-66 compares the array height data to the simulation results for the MagnaChip internal verification mask based on the calibrated chip-scale ECD model. The RMS error of 273 Å is reasonable considering the model fitting errors. Although MagnaChip did not provide step height data, the available data confirm the validity of this chip-scale model in dealing with an arbitrary layout that is far from the MIT/SEMATECH 854 test mask. The chip-scale envelope and effective step height simulation for the verification mask is shown in Figure 3-67.
Figure 3-66. Array height data and simulation results for MagnaChip verification mask.

(a) Envelope map (Å)  
(b) Effective step height map (Å)

Figure 3-67. Chip-scale ECD modeling for MagnaChip verification mask.
3.7 M2 Modeling and Simulation

To extend this model to multilevel copper metallization, a specially designed multilevel experiment gathers the information needed to model the effect of underlying topography (e.g. from M1) on the polish of upper level metal layers (e.g. M2). The M1 mask is used to generate topography across the test chip, arising from the wide range of pattern densities and pattern features on this mask. The experiment is based on the MIT/SEMATECH 854 test mask set. Figure 3-68 and Figure 3-69 show the pattern density and line width maps for the via layer and the M2 layer of the test mask set.

(a) Pattern density map (%)  
(b) Line width map (μm)

Figure 3-68. MIT/SEMATECH 854 via mask.

From Figure 3-68, the line width in the via layer is submicron and the pattern density is very low. Since the low-pattern-density sub-micron features in the via layer cannot affect the final topography significantly, only the M2 layout is considered in the chip-scale pattern dependency modeling and simulation. The M2 mask contains only 0.5/0.5 μm array structures, which are overlaid in different locations on the M1 structures.

Figure 3-69. MIT/SEMATECH 854 M2 mask.

The complete film stack information for the M2 wafers is shown in Figure 3-70.

Figure 3-70. Stack information for MIT/SEMATECH 854 M2 wafers.

Because the calibrated CMP processes leave only limited post-CMP height variation on the M1 wafers, a special polishing step is applied to exaggerate the pattern dependency effects and ensure that large enough topography remains at the end of the M1 CMP step, which makes it easier to capture the multilevel effect accurately in the M2 CMP model. After the specially designed M1 CMP process, the interlevel dielectric layers are deposited on these wafers, followed by an ordinary dual-damascene process. The process details for each M2 wafer have been briefly summarized in Chapter 2.

Figure 3-71. HRP array scans and long scans for post-CMP M1 and M2 wafers (M1: blue, M2: magenta).

Various measurements, such as copper thickness, oxide thickness, and profile (dishing and erosion), as for the previous M1 polishing, are needed on the polished M1 structures to understand the pre-electroplating topography for the M2 wafers. HRP scans and non-destructive copper thickness measurements are used on the M2 wafers to capture the ECD pattern dependency. The method and procedure for the M2 wafers are similar to those for the M1 wafers. However, the HRP scan range has to be extended to cover the whole M1/M2 area, including the non-overlapped regions, as shown in Figures 3-71 and 3-72. The data extraction sites are highlighted in Figure 3-72. The copper thickness measurement sites, shown in Figure 3-73, require much more stringent positioning accuracy because of the narrow field regions without M1 and M2 patterns.

![Mask Layout](image)

Figure 3-72. M2 structures and HRP data extraction.

To reconstruct the pre-electroplating topography, a surface response model is applied to model the envelope, effective step height and surface average of each grid cell. Using two variables to describe the pre-electroplating surface topography complicates the higher level ECD and CMP models. Because dishing is small for features smaller than the grid cell size, it is possible to use one variable, the grid cell surface average $T_{avg}$, to approximate the pre-electroplating surface. This approximation improves with smaller grid cell size. The two variables have the same value for grid cells with 100% pattern density, in which case one variable is enough to describe the cell surface. With more advanced computers, the grid cell size can be further reduced to 10 μm. Figure 3-74 compares the array topography simulation results with the HRP scan data.

Figure 3-73. Locations of the 52 sites for M2 copper thickness measurements (M1: blue, M2: magenta).

Figure 3-74. Array topography simulation for M2 pre-electroplating topography (blue: HRP data, red: top, yellow: bottom, green: surface average).

Before extending the current chip-scale ECD model to the multilevel framework, the possible coupling between the underlying initial topography and the trenches of the higher level mask has to be examined. The question can be decomposed into two aspects: whether the initial uneven topography affects the mass transfer of cupric ions and additives in the electrolyte, and how the additional surface area, compared to a flat wafer, contributes to the copper deposition process. As discussed in the previous sections, the depletion of cupric ions and additives at the feature level can be ignored. Because the height variation in the underlying topography is much smaller than the trench depth of the M2 patterns, it cannot affect mass transfer at the feature level. The uneven underlying topography can introduce additional surface that adsorbs the additives, as shown in Figure 3-75. However, the additional surface is dominated by the trench sidewall areas, and the contribution of the underlying topography can be ignored, as shown in the following figures. The additional surface fraction from the underlying topography, \( \sim 2h_{u1}/l_{u1} \), is reduced to \( \sim h_{u1}/l_{u1} \) after M2 patterning. In contrast, the additional surface fraction from the M2 patterns is \( \sim 2h_{M2}/l_{M2} \). Since \( h_{M2} \gg h_{u1} \), the additional surface from the underlying topography can be ignored in the higher level ECD modeling.
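
The relative size of the two surface contributions can be illustrated with a short calculation. This is only a sketch: the step heights and periods below are hypothetical round numbers chosen for illustration, not measured values from this work, and the function name is introduced here.

```python
def added_surface_fraction(step_height_um: float, period_um: float,
                           walls_per_period: int = 2) -> float:
    """Extra surface area per unit flat area contributed by vertical walls
    of height `step_height_um` repeating every `period_um`."""
    return walls_per_period * step_height_um / period_um


# Hypothetical illustrative numbers (assumptions, not data from the thesis).
h_u1, l_u1 = 0.05, 100.0  # underlying step of 500 Angstrom over a 100 um period
h_m2, l_m2 = 0.5, 1.0     # M2 trench 0.5 um deep on a 1 um pitch (0.5/0.5 um array)

# After M2 patterning roughly one wall per period of the underlying step remains.
underlying = added_surface_fraction(h_u1, l_u1, walls_per_period=1)
m2_pattern = added_surface_fraction(h_m2, l_m2)

print(f"underlying topography: {underlying:.4f}")
print(f"M2 pattern:            {m2_pattern:.2f}")
print(f"ratio:                 {m2_pattern / underlying:.0f}")
```

With numbers of this order the sidewall area of the M2 trenches exceeds the contribution of the underlying topography by several orders of magnitude, which is the basis for neglecting the latter.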

Because the coupling between the underlying topography and the M2 patterns is weak in terms of the ECD superfilling process, the higher level chip-scale ECD topography is simulated by superposing the underlying topography on the topography generated by the higher level pattern alone, which is computed by assuming a flat initial surface before patterning.
The array height RMS error for the M2 post-electroplating topography is 176 Å. Considering that the RMS error of the underlying topography is about 100 Å, the result validates the superposition methodology for higher level ECD topography modeling. Figure 3-76 compares the height variation simulation with the HRP scan data for the M2 post-electroplating wafer. Figure 3-77 shows chip-scale ECD modeling for the M2 post-electroplating wafer.
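
The superposition step and the RMS error metric can be sketched as follows. The arrays are synthetic placeholders, and the function names are introduced here for illustration; in practice the inputs would be the modeled underlying topography, the pattern-only ECD simulation, and the HRP data on the same grid.

```python
import numpy as np


def superpose_topography(underlying: np.ndarray, pattern_only: np.ndarray) -> np.ndarray:
    """Higher level ECD topography as the sum of the underlying topography and
    the topography computed for the higher level pattern on a flat surface."""
    return underlying + pattern_only


def rms_error(simulated: np.ndarray, measured: np.ndarray) -> float:
    """Root mean square difference between simulated and measured heights."""
    return float(np.sqrt(np.mean((simulated - measured) ** 2)))


# Synthetic placeholder maps in Angstrom (assumptions, not thesis data).
rng = np.random.default_rng(0)
underlying = rng.normal(0.0, 300.0, size=(8, 8))
pattern_only = rng.normal(0.0, 500.0, size=(8, 8))

simulated = superpose_topography(underlying, pattern_only)
measured = simulated + 100.0  # stand-in for measurement: constant 100 Angstrom offset

print(f"RMS error: {rms_error(simulated, measured):.1f} Angstrom")
```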

Figure 3-75. Uneven underlying topography.
3.8 Conclusions

A chip-scale ECD model is developed by incorporating an effective and proven feature-scale model. Additive competitive adsorption and accumulation/dilution due to surface area change are considered with reasonable simplifications. At the same time, the edge effect is identified and addressed in this model. The calibration results show that this model can accurately capture the electroplated copper surface height variation in the ordinary process window used in the semiconductor industry, including the previously ignored edge effects. The verification results from another mask with a different design style confirm the validity of this chip-scale model in processing an arbitrary layout design. The layout extraction and the two-dimensional implementation of the model make it possible to process any feature size and shape. Further, this model is extended to deal with multilevel copper metallization. The pattern dependent pre-electroplating topography from the previous processes of the underlying level, especially CMP, is found to make no significant contribution to the pattern dependency of the following ECD process. The final copper film topography for higher level interconnects can be viewed as the superposition of the pre-electroplating topography and the pattern dependent topography arising only from the higher level layout design itself.

The computational efficiency, good accuracy, layout flexibility and multilevel framework of the chip-scale ECD model make it practical to use this model in ECD/CMP process integration and optimization, layout screening and dummy fill design. The applications of this chip-scale ECD model with the subsequent chip-scale CMP model will be presented in a later chapter.
Chapter 4

Coherent Chip-Scale Modeling for Copper CMP

The chip-scale ECD model in the previous chapter provides the initial topography for the subsequent CMP process, including the single level and multiple level cases. In this chapter, an improved and coherent chip-scale CMP model framework for copper bulk polishing, copper over-polishing, and barrier layer polishing is presented, and this model is extended to cover single level and multi-level cases as in the copper ECD model. In the new CMP model, the integration of contact wear and density-step-height models is more seamlessly implemented and addresses inherent shortcomings of the previous model.

In the new model, a local density is used instead of the effective density computed by way of a planarization length, and only a contact wear coefficient is used to characterize the long-range planarization capability, thus avoiding the conflict between the planarization length and the contact wear coefficient in capturing topography variation. In addition, the pressure computed for each 180×180 µm block using contact wear is further redistributed among 20×20 µm cells within that block. At the cell level, the height and width distributions of the pad asperities are considered to calculate the pressure value at the trench top and bottom. The same model framework is used for different polishing steps, so that it is possible to directly compare basic process characteristics, such as pad stiffness, of different polishing stages. Results with the new model show a significant improvement of the modeling accuracy to 100 Å or so of root mean square error.
In this chapter, the previous CMP models closely related to the current work, especially contact wear models and density-step-height models, are briefly reviewed. Then, the new CMP model with a three-part framework is introduced, and an extension is presented in which the pad asperity height and contact size distributions are accounted for. Finally, CMP model calibration and verification for the first level and second level interconnects are presented.

4.1 Previous CMP Models

The post-CMP surface topography variation is a major concern in copper interconnect formation. The modeling of pattern dependencies is needed in order to understand the fundamental limitations of interconnect fabrication technologies, to verify yield or performance problem areas on product chips, and to assist in new process design with enhanced robustness, reliability and manufacturability. In addition, surface topography can be aggravated by multilevel interconnect structures prevalent in real chips, where the surface non-planarity of lower level copper metallization can influence the higher level topography. Unlike in previous technology generations, the ITRS 2003 roadmap [64] also specifically notes the need for CMP and interconnect topography modeling for current and future technology nodes, with modeling errors within tens of Å.

With shrinking of interconnect dimensions (both vertical and lateral) and improvements in CMP processes, the previously reported copper CMP pattern dependence model [49], developed at the quarter micron technology node, faces accuracy limits. The latest CMP data shows that the model prediction errors are comparable to the reduced topography variation, indicating the need for an improved model.
CMP is a complex process involving mechanical and chemical synergetic effects. Various models have been proposed to explain wafer-scale or feature-scale effects, as well as chip-scale thickness variations. Contact mechanics and step-height-dependent models are two major methods used to simulate the chip-scale pattern dependencies in CMP processes. A brief summary of these two models and their integration will be introduced in the following paragraphs.

The starting point to model the CMP process is to compute the pressure distribution on the uneven or patterned wafer surface. As mentioned in the first chapter, the pads, or more specifically, the pad asperities directly contact the surface of the wafer in the ordinary CMP process window (process conditions which introduce hydroplaning, for example, are possible, but these are not used in practice). In the direct contact case, the contribution of the slurry motion to the contact pressure can be ignored and the pressure distribution is mainly determined by the wafer-pad contact [65]. Thus, the global planarization effect in CMP can be explained by the deformation of the pad and resulting differential pressures due to the wafer topography. Various mechanics methods [65] are used to describe the pad deformation and the contact pressure distribution, such as a simple beam model [66] and finite element methods (FEM). Chekina et al. [65], Yoshida [67] and Vlassak [68] proposed various models to compute the pad elastic deformation and the contact pressure distribution during the polishing based on contact mechanics [69] and wear-contact theory [70]. These contact mechanics based wear models, referred to generally as the contact wear model in this thesis, are widely used to model and simulate the copper and dielectric CMP processes.
In Chekina’s paper [65], the 2D formula to describe the relation between the displacement of the wafer surface \( w(x, y) \) and the contact pressure distribution \( p(x, y) \) is

\[
w(x, y) = \frac{(1 - \nu^2)}{\pi E} \int_{\omega} \frac{p(\xi, \eta)}{\sqrt{(x - \xi)^2 + (y - \eta)^2}} \, d\xi \, d\eta \tag{4-1}
\]

\[
w(x, y) = f(x, y) + c, \quad (x, y) \in \omega; \qquad w(x, y) > f(x, y) + c, \quad (x, y) \notin \omega \tag{4-2}
\]

\[
p(x, y) \geq 0, \quad (x, y) \in \omega; \qquad p(x, y) = 0, \quad (x, y) \notin \omega \tag{4-3}
\]

\[
P = \int_{\omega} p(\xi, \eta) \, d\xi \, d\eta \tag{4-4}
\]

where \( c \), \( \omega \), and \( P \) are the penetration thickness, the contact area, and the total known load, respectively. This formula can also be reduced to the 1D case to model a line array, as shown in Figure 4-1. Compared to other methods for solving the displacement or contact pressure distribution in the areas of interest, the contact wear model provides an accurate method with modest computational cost when applied to relatively small problems (in terms of the number of discretized elements). Only the pad surface mechanical properties and deformation are considered, rather than the whole thickness of the pad as in FEM [65], and the resulting savings in computation time give this method a significant advantage.
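
As an illustration of how the integral relation between pressure and displacement can be evaluated numerically, the sketch below discretizes the surface into square cells and sums the contributions directly. It is a minimal example under stated assumptions, not the implementation used in the cited works: each distant cell is treated as a point load at its center, the loaded cell itself uses the closed-form integral of \(1/r\) over a square, and the pad properties (a modulus of 30 MPa and a Poisson's ratio of 0.3) and the cell size are illustrative values.

```python
import numpy as np


def displacement_from_pressure(p: np.ndarray, cell: float, E: float, nu: float) -> np.ndarray:
    """Direct-sum discretization of the half-space relation between contact
    pressure p (Pa) and surface displacement w (m) on a grid of square cells."""
    ny, nx = p.shape
    coeff = (1.0 - nu ** 2) / (np.pi * E)
    y, x = np.meshgrid(np.arange(ny) * cell, np.arange(nx) * cell, indexing="ij")
    # Integral of 1/r over a square cell of side `cell`, evaluated at its center.
    self_term = 4.0 * np.log(1.0 + np.sqrt(2.0)) * cell
    w = np.zeros((ny, nx), dtype=float)
    for i in range(ny):
        for j in range(nx):
            r = np.hypot(x - x[i, j], y - y[i, j])
            r[i, j] = np.inf               # avoid division by zero for the own cell
            kernel = cell ** 2 / r         # point-load approximation for other cells
            kernel[i, j] = self_term
            w[i, j] = coeff * np.sum(kernel * p)
    return w


# Illustrative values (assumptions): 5x5 grid of 40 um cells, one loaded cell.
pressure = np.zeros((5, 5))
pressure[2, 2] = 1.0e4  # Pa
w = displacement_from_pressure(pressure, cell=40e-6, E=30e6, nu=0.3)
print(w[2, 2], w[2, 3], w[2, 4])
```

The cost of this direct summation grows with the square of the number of cells, which is why it is only practical for relatively small problems.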

However, the computational limitations make any attempt to deal with features across the whole chip infeasible: for example, 0.01 \( \mu \text{m} \) discretization (as might be needed for accurate feature scale prediction) across a 20 mm by 20 mm chip gives \( 4 \times 10^{12} \) elements, which is not feasible with realistic computational or memory resources.
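
The element count quoted above follows from simple arithmetic, reproduced in the sketch below. The storage estimate assumes one double-precision value per element, and the comparison uses the 40 µm cell size of the chip-scale discretization described later in this chapter.

```python
# Work in nanometers so the arithmetic stays in exact integers.
chip_side_nm = 20_000_000        # 20 mm
feature_cell_nm = 10             # 0.01 um
chip_cell_nm = 40_000            # 40 um chip-scale cell, for comparison

n_side_fine = chip_side_nm // feature_cell_nm
n_elements_fine = n_side_fine ** 2
n_elements_coarse = (chip_side_nm // chip_cell_nm) ** 2

# Assumption: 8 bytes (one double) per element for a single field such as pressure.
terabytes_per_field = n_elements_fine * 8 / 1e12

print(f"feature-scale grid: {n_elements_fine:.1e} elements, "
      f"{terabytes_per_field:.0f} TB per field")
print(f"40 um cell grid:    {n_elements_coarse} elements")
```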
In Yoshida's paper [67], a boundary element methodology (BEM) is proposed to deal with the problem in the contact wear model that neither the initial pad displacement nor the initial contact pressure distribution at the areas of interest is available. The methodology is to discretize the wafer surface into small cells and assume that some cells are in contact (where pressures are calculated based on wafer topography) and other cells are not in contact (where pad displacements are calculated based on pressure); the assumed pad displacements and contact pressures are then used to solve the other
unknown pad displacements and contact pressures to completely construct the pad 
displacement and contact pressure distribution at the areas of interest [67].

The pad asperity distribution or roughness has also been applied to the feature-scale 
pattern dependence in CMP by a number of researchers, including Yu et al. [71, 72] and 
Vlassak [68]. Vlassak uses a contact mechanics analysis to evaluate the local pressure 
distribution between features on the wafer and the polishing pad based on the compliance 
of the pad and its roughness, and thus predicts dishing and erosion during CMP, which 
are controlled by the local pressure distribution [68]. Most of the effects of pattern 
density, line width, applied down-force, selectivity, and pad properties, on both dishing 
and erosion evaluated by using the model are in good agreement with the available data 
by Tugbawa [38]. The model captures several physical fundamentals in the CMP process 
related to pressure dependence.

![Figure 4-2. A Schematic of the Contact Model [68].](image)

The contact of a compliant polishing pad with the surface of a rigid wafer, as 
proposed by Vlassak [68], is shown in Figure 4-2. The rough surface of the pad contains 
asperities with an exponential height distribution, as given in Equation 4-5.

\[
P(z) = \frac{1}{2\sigma} \exp \left( -\frac{|z|}{\sigma} \right) \tag{4-5}
\]
where \( z \) is the height of the asperity above or below the pad surface and \( \sigma \) is the characteristic roughness parameter for asperity height variation, which can be measured in a specific pad. During CMP, some of the pad asperities directly contact the wafer. The pad is elastically deformed and the force, which can be calculated by applying contact mechanics, is transferred from the pad to the wafer. The contact pressure distribution can be derived from Equation 4-6 if the pad deformation \( w(x,t) \) is known. The reverse case, where pressure is known and deformation is needed, can be solved by Equation 4-7.

\[
p(x,t) = \frac{4E}{3\sqrt{\kappa}} \frac{\nu}{1-\nu^2} \exp\left( -\frac{w(x,t)-T(x,t)}{\sigma} \right) \quad \text{for} \quad w - T \geq 0 \quad (4-6)
\]

\[
w(x,t) - C(t) = -\frac{2(1-\nu^2)}{\pi E} \int_{-L/2}^{L/2} \left[ p(s,t) \ln \left| \sin \frac{\pi(x-s)}{L} \right| \right] ds \quad (4-7)
\]

where \( E \) and \( \nu \) are Young’s Modulus and Poisson’s ratio of the pad, \( \kappa \) is the curvature of the pad asperities, and \( C(t) \) is a time-dependent constant displacement shift to maintain the total pressure conservation [68]. Then the local contact pressure is used to determine the local removal rate using Preston’s equation. In addition, a non-Prestonian relation common in real CMP processes can be used instead if the removal rate increases monotonically with pressure [68]. The values of \( E \) and \( \sigma \) used in this research are 20-45 MPa and 0.005-0.01 \( \mu \)m respectively. Apparently, these values are much lower than the independently measured data. This issue will be discussed further in a later section.
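
The asperity height distribution of Equation 4-5 can be checked numerically. The sketch below verifies that the distribution integrates to one and that the fraction of asperities taller than a height \( d \geq 0 \) is \( \tfrac{1}{2}\exp(-d/\sigma) \), which has the same exponential form as the gap dependence in Equation 4-6. The value of \( \sigma \) is taken from the upper end of the fitted range quoted above; the function names are introduced here for illustration.

```python
import numpy as np


def asperity_height_pdf(z: np.ndarray, sigma: float) -> np.ndarray:
    """Two-sided exponential height distribution of Equation 4-5."""
    return np.exp(-np.abs(z) / sigma) / (2.0 * sigma)


def fraction_taller_than(d: float, sigma: float) -> float:
    """Closed-form tail of Equation 4-5 for d >= 0."""
    return 0.5 * float(np.exp(-d / sigma))


def trapezoid(yv: np.ndarray, xv: np.ndarray) -> float:
    """Plain trapezoidal rule, written out to avoid version-specific helpers."""
    return float(np.sum(0.5 * (yv[1:] + yv[:-1]) * np.diff(xv)))


sigma = 0.01  # um, upper end of the fitted range quoted in the text
z = np.linspace(-20.0 * sigma, 20.0 * sigma, 200001)
pdf = asperity_height_pdf(z, sigma)

total = trapezoid(pdf, z)
mask = z >= sigma
tail_numeric = trapezoid(pdf[mask], z[mask])

print(f"normalization: {total:.6f}")
print(f"tail beyond sigma: numeric {tail_numeric:.4f}, "
      f"closed form {fraction_taller_than(sigma, sigma):.4f}")
```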

The contact wear model can be applied at various length scales in CMP modeling and simulation, not only at the feature scale as shown in the previous two figures, but also at the chip scale and wafer scale. Tugbawa [49] developed an integrated chip-scale copper CMP model by incorporating the contact wear model on the chip-scale with the density-step-height model on the feature-scale. The motivation for the use of the contact wear model is to account for the contact pressure distribution due to the global height variation from the previous copper ECD process. Xie [73] also applied the contact wear model to understand the wafer edge pressure distribution affected by the pressure and edge geometry of the wafer and the retaining ring, and the gap between the wafer and retaining ring.

Although the contact wear model decreases the computational load dramatically compared to FEM methods and is applicable at various scales in CMP, it is difficult to use this model across the whole layout to model the contact pressure distribution and pad deformation down to the feature scale. The discretization size at the scale of the features of interest, and the need to calculate the time-stepped evolution of the contact pressure distribution and pad deformation, make the computation prohibitive for chip-scale modeling. An alternative model with reasonable accuracy and low computational load is needed to handle the pressure distribution on the feature scale and to relax the discretization size for the contact wear model, so as to significantly lower the computational cost of chip-scale modeling and simulation. The density-step-height model is an approximate model that meets these requirements. In the next section, the density-step-height model is reviewed briefly.

There are two major components in the density-step-height model: the effective pattern density and the step-height model. Stine et al. proposed analytic solutions for pattern dependencies in CMP based on the concept of effective pattern density [74]. As mentioned in the last chapter, the effective pattern density is calculated by passing the layout extracted local pattern density through a specific filter with a characteristic width called the planarization length. The planarization length defines the averaging range and effectively captures the long-range pad bending effect. In the pattern-density model, the “up area” (raised feature) polishing rate is the blanket removal rate divided by the un-patterned area fraction, which is equal to 1 minus the effective pattern density at the same location. Here, pattern density is defined as the recessed pattern area fraction over the total wafer surface at a certain location or cell; this definition differs from that in the Stine paper [74] (which focused on up area pattern density), but is used to be consistent with derivations in the model developed later in this thesis. In the original Stine model [74], there is no material removal at the “down area” (recessed regions between individual features) until the step-height is completely removed. Thus an implied assumption in this model is that the pad is incompressible. However, this assumption can be relaxed to account for both up area and down area polishing [74, 75, 76]. In the step-height model, the polishing rates at the up area and down area vary linearly with the step height if the step height is less than a threshold value. This assumes that the pad is compressible over the feature scale, unlike in the pure pattern density model where the pad is considered to be incompressible and contact is made only with raised areas. If the step height is over the threshold, no material is removed in the down area, as the recessed region is far enough below the surface that the pad does not make contact [75, 76]. The density-step-height model combines the pattern density model and the step height model [77]. In this model, the effective pattern density concept is used, together with a step-height dependent pressure.
Figure 4-3 shows the framework of the density-step-height model, where the removal rate is expressed as a function of step-height (or dishing) and effective pattern density. The empirical Preston removal relation ($K = k_p P v$, where $P$ is the pressure and $v$ is the relative velocity) is incorporated in the density-step-height model. In Figure 4-3, $K$ is the blanket wafer polishing rate, and $\rho$ is the effective pattern density, which is assumed to be constant in time. The up area polishing rate is $K/(1-\rho)$, assuming that no down area polishing occurs until the local step height is less than the effective contact height $h_e$. As the step height decreases, the pad starts to contact the down area and polishes both the up and down areas at different removal rates in the single material polishing case until the step height is zero. Figure 4-3 shows the simplified linear removal rate dependency on step height. In fact, the effective contact height $h_e$ is analogous to the inverse of the elastic coefficient in Hooke’s Law, resulting in the linear changes in the removal rates with the step height. A steady-state step height can be achieved in dual material polishing if the removal rates at the up area and down area are equal at a specific value of step height, as shown in Figure 4-3.
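
The linear density-step-height dependence for the single material case can be sketched as below. The specific interpolation is an assumption consistent with the description above: the down area rate rises linearly from zero at the contact height to the blanket rate at zero step height, and the up area rate is set so that the area-weighted average rate always equals the blanket rate. Function names and numerical values are illustrative only.

```python
def removal_rates(step_height: float, blanket_rate: float,
                  density: float, contact_height: float) -> tuple[float, float]:
    """Return (up_rate, down_rate) for single material polishing.

    `density` is the recessed area fraction, as defined in the text."""
    if not 0.0 <= density < 1.0:
        raise ValueError("density must be in [0, 1)")
    if step_height >= contact_height:
        # Pad touches only the raised areas.
        return blanket_rate / (1.0 - density), 0.0
    frac = max(step_height, 0.0) / contact_height
    up = blanket_rate * (1.0 + density / (1.0 - density) * frac)
    down = blanket_rate * (1.0 - frac)
    return up, down


def simulate_step_height(h0: float, blanket_rate: float, density: float,
                         contact_height: float, dt: float, n_steps: int) -> list[float]:
    """Explicit time stepping of the step height, dh/dt = -(up - down)."""
    h = h0
    history = [h]
    for _ in range(n_steps):
        up, down = removal_rates(h, blanket_rate, density, contact_height)
        h = max(h - (up - down) * dt, 0.0)
        history.append(h)
    return history


# Illustrative values (assumptions): heights in Angstrom, rates in Angstrom/s.
history = simulate_step_height(h0=1000.0, blanket_rate=50.0, density=0.5,
                               contact_height=500.0, dt=0.5, n_steps=200)
print(history[0], history[10], history[-1])
```

Under these assumptions the step height first decreases linearly in time while it exceeds the contact height, and then decays exponentially with a time constant of $h_e(1-\rho)/K$.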

Xie et al. re-examined the physical basis for the density-step-height model by using the contact wear model [53]. The comparison of the two models shows that the assumptions and approximations used in the density-step-height model are good simplifications of the more accurate contact wear model, as shown in Figure 4-4. Basically, the density-step-height model can be viewed as an approximate version of the contact wear model on the feature scale. Another conclusion from Figure 4-4 is that an exponential relation between the pressures at the up and down areas and the step height is a better approximation than the linear approximation used in the density-step-height model.

![pressure vs. step height](image)

Figure 4-4. Simulated pressure dependence on step height [53].

The contact height $h_c$ can be approximated by a linear fit to \((\text{line width})/(1-\text{effective density})^{1/3}\), as shown in Figure 4-5. This study answers one of the key questions in the density-step-height model: what is the right formula for the contact height $h_c$? The previous approach to this issue was purely empirical fitting [49]. However, the asperity distribution and the asperity filtering effect are not considered in this study, due to the relatively wide feature sizes used in the Xie et al. paper [53]. Instead, the pressure distribution at the up and down areas is calculated based on the pad surface bending into the trenches. When the asperity size is comparable to the feature size, not all of the asperities can freely contact the down area. In this case, the pressure distribution at the up and down areas has to be modified to reflect this effect.

![Graph](image)

Figure 4-5. Contact height approximation by linearly fitting to contact wear calculations, giving \((\text{line width})/(1-\text{effective density})^{1/3}\) [53].

Tugbawa contributed pioneering work in integrating the contact wear model with the density-step-height model for chip-scale M1 (first metal layer, or metal 1) copper bulk polishing simulation [49]. His integrated model provides a good solution by balancing computational complexity and efficiency against modeling accuracy. Contact mechanics is used to redistribute the effective polishing pressure at different locations within the die by considering the long-range topography of the chip, instead of using the nominal pressure computed from the down force. Then the density-step-height model is applied to capture the material removal rate of each patterned region, using the effective envelope pressure computed with the mechanics model. In this approach, a given chip layout is discretized into fixed size blocks, typically 240×240 µm, and each block is further discretized into 40×40 µm cells. The contact wear model is used to compute the pressure distribution due to the long-range surface height variation on the large size blocks. The block pressure from the contact wear model is then directly applied to every cell within the block, and the density-step-height dependence is implemented to calculate the material removal rate of the up and down areas of each cell.
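
The two-level discretization can be sketched with array reshaping. In this minimal example, each 240×240 µm block holds 6×6 cells of 40×40 µm; the block envelope is taken as the highest cell envelope in the block, and a block-level pressure map is copied to every cell in the block. The envelope data and the uniform block pressure are placeholders, and the function names are introduced here for illustration.

```python
import numpy as np

CELLS_PER_BLOCK = 6  # 240 um block / 40 um cell


def block_reduce(cell_map: np.ndarray, n: int = CELLS_PER_BLOCK, reducer=np.max) -> np.ndarray:
    """Collapse an array of cells into blocks of n x n cells with `reducer`."""
    ny, nx = cell_map.shape
    if ny % n or nx % n:
        raise ValueError("cell map dimensions must be multiples of the block size")
    blocks = cell_map.reshape(ny // n, n, nx // n, n)
    return reducer(blocks, axis=(1, 3))


def broadcast_to_cells(block_map: np.ndarray, n: int = CELLS_PER_BLOCK) -> np.ndarray:
    """Copy each block value to all n x n cells inside that block."""
    return np.repeat(np.repeat(block_map, n, axis=0), n, axis=1)


# Placeholder cell envelope heights (Angstrom) for a 2 x 2 block region.
rng = np.random.default_rng(0)
cell_envelope = rng.normal(0.0, 200.0, size=(12, 12))

block_envelope = block_reduce(cell_envelope)          # highest envelope per block
block_pressure = np.full(block_envelope.shape, 2.0)   # placeholder pressure map
cell_pressure = broadcast_to_cells(block_pressure)    # same pressure for all cells

print(block_envelope.shape, cell_pressure.shape)
```

Passing `reducer=np.mean` instead gives the block average envelope.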

Equation 4-8 is the contact wear formulation used in Tugbawa’s model [49]. The implied assumption in using this equation is that the pad is a massive elastic body so that the pad thickness need not be considered, and the wafer is a rigid body without any deformation. Here, ν is Poisson’s ratio of the pad, E is the Young’s modulus of the pad, and A is the layout area. The pressure and displacement at each point are affected by the adjacent pressure and displacement distribution. Based on this equation, the pad deformation can be solved if the pressure distribution is known, or vice versa. The methodology of Yoshida to handle the initial pad deformation and pressure distribution is implemented in Tugbawa’s approach [49, 67]. The initial pad deformation refers to the top (highest) envelope of each block without considering the asperity effect to accommodate the smaller scale topography variation, as shown in Figure 4-6. However, a fast Fourier transform (FFT) based algorithm is applied to solve Equation 4-8 rather than the matrix manipulation implemented in Yoshida’s paper [67]. The major advantage of this FFT-based method is computational efficiency. The details of the implementation of this FFT-based method are presented in Tugbawa’s Ph.D. thesis [49].
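
The computational benefit of an FFT-based evaluation can be illustrated with a zero-padded convolution. The sketch below is not the algorithm of [49]; it assumes that the pressure-to-displacement relation has the same half-space kernel as Equation 4-1, approximates distant cells as point loads, and uses illustrative pad properties. The function names are introduced here.

```python
import numpy as np


def influence_kernel(ny: int, nx: int, cell: float) -> np.ndarray:
    """Kernel values for all cell offsets from -(n-1) to (n-1) in each direction."""
    dy = np.arange(-(ny - 1), ny) * cell
    dx = np.arange(-(nx - 1), nx) * cell
    r = np.hypot(dy[:, None], dx[None, :])
    kernel = np.empty_like(r)
    far = r > 0
    kernel[far] = cell ** 2 / r[far]                             # point-load approximation
    kernel[~far] = 4.0 * np.log(1.0 + np.sqrt(2.0)) * cell       # own-cell integral of 1/r
    return kernel


def displacement_fft(p: np.ndarray, cell: float, E: float, nu: float) -> np.ndarray:
    """Displacement from pressure by zero-padded FFT convolution."""
    ny, nx = p.shape
    kernel = influence_kernel(ny, nx, cell)
    shape = (3 * ny - 2, 3 * nx - 2)  # size of the full linear convolution
    spectrum = np.fft.rfft2(p, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    coeff = (1.0 - nu ** 2) / (np.pi * E)
    # Keep the part of the convolution aligned with the original grid.
    return coeff * full[ny - 1:2 * ny - 1, nx - 1:2 * nx - 1]


# Illustrative values (assumptions): 40 um cells, E = 30 MPa, nu = 0.3.
rng = np.random.default_rng(1)
pressure = rng.random((4, 5)) * 1.0e4
w = displacement_fft(pressure, cell=40e-6, E=30e6, nu=0.3)
print(w.shape)
```

The zero padding prevents the periodic wrap-around of the discrete Fourier transform from mixing contributions across the layout boundary; a periodic treatment of the die would instead omit the padding.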
The integrated framework of the chip-scale CMP model as proposed by Tugbawa has the potential to be extended to cover all polishing steps in the metal 1 and higher level layers. However, several extensions in the CMP model are needed to better integrate the contact wear and density-step-height models, in order to provide a general and uniform simulation platform that is applicable in a consistent fashion for the various polishing steps and process parameters.

The contact wear model coefficient \( C_w = (1 - \nu^2) / \pi E \) and the planarization length both describe the long-range bending of the CMP polishing pad. The competition between the two parameters affects the stability of the integrated model. As mentioned in the previous paragraph, a more physics-based methodology is required to redistribute the block pressure to the cell level, to overcome the conflict between these two parameters and modeling approaches present in Tugbawa’s integrated model. In addition, pad surface asperity effects, not considered in Tugbawa’s model, are needed to improve the accuracy of contact and step height effects. These limitations in the previous model will be addressed in the newly developed model framework, which will be introduced in the following sections.
4.2 New Model Framework

A new CMP model [78] is proposed. The physical motivation and background for the model is first presented, followed by a detailed description of the model implementation. Some limitations in the new model are then discussed; these motivate further extensions to the model which are developed and presented in Section 4.3.

4.2.1 Physical Motivation

As shown in Figure 4-7, the CMP polishing pad can be conceptually divided into two parts, pad base and asperities, which have significantly different properties and contributions to CMP planarization ability. The pad base bends over the relatively long scale and follows the long-range wafer surface topographical variation. However, the asperities are free-standing and deform independently, so they can accommodate local topographical variation. In other words, the pad base is responsible for the global planarization ability, such as recess or oxide loss in large arrays of copper lines, and asperities are responsible for local planarization ability, especially step-height reduction or dishing of individual features. The hierarchical structure is consistent with stacked pads which seek to optimize global and local planarization ability by layered soft and hard pads.

Here, the average envelope of all cells in each 240×240 μm block, rather than the highest position within the block as assumed in the previous model, is used to define the total pad long-range deformation. We assume a linear elastic relationship between block pressures and pad base deformation, and use the contact wear formulation to model long range pad base deformation and block pressures.
For fine features, on the other hand, the additional action of local asperities on the pad is conjectured to dominate the feature-scale polishing behavior. In this case, we can apply a local pattern-density-step-height model, which expresses the relationship between feature step-height and pressure as linearly proportional, up to some “contact height”, $h_c$, at which the recessed feature experiences no polishing pressure. Several factors may contribute to a dependence of the contact height on the line width, line space, or pattern density. The physical basis and form for this dependence remain to be proven; in this version of the model, empirical relationships between contact height and layout parameters are used. Finally, for large features, both the pad bending and pad asperity effects may be important, as shown in Figure 4-8. For features over 240 μm in lateral dimension, for example, we consider the deformation of the pad base into the feature to be non-negligible and use the contact wear model to directly calculate the local effective pressure. At the “cell” dimension (the discretization of the wafer surface within each block), we introduce an additional model component compared to the previous model, considering the asperity effects to apportion the local pressures further. For each cell, the “envelope height” is calculated relative to the block envelope average. Perturbations to the block pressure are then calculated for each of the cells in the block, based on an assumed linear pressure vs. cell envelope height dependence.
Within each cell, the total topography, $TP_{total}$, is defined as the relative height variation of the surface of the wafer. At the same time, the pad topography, $TP_{pad}$, is the relative deformation variation of the pad bulk, and the asperity topography, $TP_{asperity}$, is the average relative thickness variation of the asperity layer.

We assume that $TP_{pad} + TP_{asperity} = TP_{total}$. In this model framework, the pad topography $TP_{pad}$ is assumed to be responsible for the global or long-range topography and the asperity topography $TP_{asperity}$ is responsible for the local topography, by assuming that each individual asperity deforms independently like a small spring. Although the asperities might also contribute a small part to the reduction of global topography, such an approximation is close to reality and can significantly decrease the computational load. Another argument for this approximation is that there is no clear boundary between the pad bulk and the pad asperities. The fitted Young's modulus, $E$, from the contact wear model characterizes the reduction of global topography, which includes the contribution from the pad bulk and a partial global contribution from the pad asperities, mostly from the bottom of the asperities. It is likely that only the very top part of each asperity accommodates the local height variation or step height, as discussed in more detail in the next section. The fitted value of the pad bulk Young’s modulus, $E$, is much lower than the independently measured pad bulk Young’s modulus. Several factors can contribute to the discrepancy. One factor is that the global deformation contribution from the asperities is included in the fitted Young’s modulus and lowers its value. Thus this fitted Young’s modulus, $E$, is an effective averaged value appropriate for describing the long-range pad response to wafer topography.

4.2.2 Model Implementation

The implementation of the new model, with a three-part modeling approach [78] that integrates contact wear and step-height concepts as shown in Figure 4-9, addresses the shortcomings of the previous model. In part A, the block pressure distribution (in 240×240 µm blocks) is calculated using a contact wear model, similar to that used in the previous model. In part B of the new framework, however, the block pressure is further apportioned among the higher-resolution 40×40 µm cells of each 240×240 µm block. The deviation of each cell's wafer surface from the average block envelope is treated as the cell “envelope height.” The cell pressure deviation from the block pressure is proportional to the cell “envelope height”, with an approximate contact height computed from the density-step-height model. Finally, in part C a local pattern-step-height model is used for a rapid calculation of both up and down removal rates of the wafer features. In the new model, however, a local pattern density is used instead of the effective pattern density (computed using a planarization length). Thus only the contact wear coefficient characterizes the long-range planarization capability, which avoids the conflict between the planarization length and the contact wear coefficient.

In the new model, a similar model framework is applied for different polishing steps (bulk copper removal, over-polishing and barrier removal) and all data, including high-resolution profilometer scans and film thickness on field regions, are input into the model to further improve the modeling accuracy. Another advantage is that it is possible to directly compare basic process characteristics, such as pad stiffness, of different polishing steps. The modeling methodology [78] is shown in Figure 4-10.
Figure 4-9. Three-part strategy in the new model framework.
In the modeling and simulation, only two geometric variables are used: *envelope* and *step-height*. With the stack information, other values including recess, dishing, and oxide loss can be derived from these two basic variables. This definition simplifies the modeling and simulation, and is suitable in multi-level metallization and random layout cases in which there is no clearly defined field region. As shown in Figure 4-11, the envelope is defined as the absolute distance between the higher part of trench or oxide space and the top of the barrier layer, which is the reference plane. The step height follows the traditional definition, as the distance upward from the surface of the inlaid copper feature to the surface of the surrounding oxide space. Thus a dished copper line has a positive step height, while an over-filled plated fine feature (with accelerated deposition over the copper lines) may have a negative step height. If the feature size is larger than a given value, such as 240 μm, step height is defined to be zero and the
envelope refers to the top of the trench directly as if there were a virtual oxide space, level with the copper. This is consistent with part A in the new model framework, where the envelope height is defined as what the pad base can react to. For a large feature, the pad base will bend into the trench and directly interact with the down area.

Figure 4-11. Definition of envelope and step height in the new model framework.

4.2.3 Shortcomings

The new model framework considers the pressure redistribution and pressure conservation in each cell of every block, and only the local pattern density is used in the step-height model. These improvements make the integration more seamless, and the RMS error in modeling and simulating the chip-scale CMP topography evolution is lowered to approximately 100 Å. Although this version of the integrated chip-scale CMP model has been significantly improved, some issues remain to be addressed to further improve the model framework.

First, the asperity height distribution and asperity size distribution have to be considered in the nonlinear pressure redistribution within the blocks. Second, the density-step-height model has to be refined by incorporating some major results by Xie et al. In particular, the nonlinear exponential relation in the pressure distribution at the feature scale can be used to improve the model accuracy. The formula for the contact height, $h_c$, can also simplify the model calibration process and improve the model accuracy. Third, the linear Prestonian removal rate relation needs to be replaced to reflect the behavior of many advanced CMP processes; the model framework must be revised to implement various non-Prestonian removal rate relations. Finally, a method to reflect the asperity filtering effect (introduced in the next section) within the density-step-height model is needed.

4.3 Revised Model Framework

The three-part framework for chip-scale CMP modeling and simulation introduced in the previous section is further improved in this section by considering the pad surface morphology, that is, the asperity distribution. Section 4.3.1 reviews the pad asperity height and size distributions, and Section 4.3.2 applies some conclusions and assumptions from Section 4.3.1 to modify the density-step-height model. Finally, Section 4.3.3 discusses the methodology for incorporating the detailed asperity information into part B (cell pressure distribution) and part C (step-height effects) of the revised framework. The application of the updated three-part framework with asperity information is presented in Section 4.4, including M1 (metal layer 1) calibration and verification, and M2 (second level metal) simulation.

4.3.1 Pad Surface Morphology

The asperity distribution is strongly affected by the pad conditioning parameters including the conditioning pressure, conditioning time, and abrasive size, as well as by the pad properties including the mechanical stiffness, pore size, pore density, and pore
distribution. Recently, several researchers have identified the critical impact of the pad asperities on CMP polishing behavior, including the pad contact mechanics and slurry flow [79]. A model including the time evolution of the pad height distribution due to abrasive wear is developed to explain the blanket removal rate change during the polishing [80]. Based on these results, it is necessary to include the asperity distribution into the chip-scale model to improve the current three-part framework.

The typical CMP polishing pads used in copper polishing are made from polyurethane foam. Figure 4-12 shows the cross sections of IC-1000 and Politex pads from Rohm and Haas [81]. The IC-1000 is a typical hard pad widely used in the industry, especially for overburden copper film polishing, while the Politex pad is a typical soft pad widely used for barrier removal.

![Figure 4-12. Rohm and Haas IC-1000 and Politex pads [81].](image)

The asperity roughness has a hierarchical structure as shown in Figure 4-13 [82]. Various metrology tools are used to quantitatively study the asperity distribution, such as optical interferometry, optical microscopy, stylus profilometry [79], confocal laser
scanning microscopy [74] and dual emission laser induced fluorescence (DELIF) [83]. The confocal laser scanning microscope is a powerful tool to study the pad surface morphology or asperity distribution even under pressure, although it cannot at present be used for in-situ pad measurement.

As mentioned in the previous section, Vlassak [68] developed a contact-mechanics-based model of dishing and erosion in various CMP processes. In his model, the asperity height distribution decays exponentially, and Hertz’s formula is used to compute the pressure on each asperity. Based on Equation 4-6, the relation between the relative displacement of the asperity and the applied nominal pressure follows Equation 4-9, as shown in Figure 4-14.

$$\frac{p_s}{p_i} = \exp\left(\frac{\Delta h}{\lambda}\right)$$  \hspace{1cm} (4-9)

where $\lambda$ is the characteristic length in the exponential asperity height probability function. Clearly, this exponential asperity height distribution has to be validated against, and its characteristic length extracted from, measurement data.
Figure 4-14. Relative displacement of the asperity and the applied nominal pressure.
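
As a minimal sketch of Equation 4-9, the pressure ratio can be evaluated directly for a few relative displacements. The function name and the numerical values of $\lambda$ and $\Delta h$ below are illustrative assumptions, not values taken from the cited model.

```python
import math


def asperity_pressure_ratio(delta_h, lam):
    """Pressure ratio p_s / p_i of Equation 4-9 for a relative displacement delta_h.

    delta_h and lam must share the same length unit.
    """
    return math.exp(delta_h / lam)


lam = 1000.0  # characteristic length in angstroms (illustrative assumption)
for delta_h in (-3000.0, -1000.0, 0.0, 1000.0):
    ratio = asperity_pressure_ratio(delta_h, lam)
    print(f"delta_h = {delta_h:7.0f} A -> p_s/p_i = {ratio:.4f}")
```

A displacement of one characteristic length changes the pressure by a factor of $e$, which is why $\lambda$ sets the height scale over which asperities stop loading a recessed surface.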

Figure 4-15 compares the representative pad surface data from a fully conditioned pad and a glazed pad and their asperity height distribution [79]. The fully conditioned pad shows a Gaussian or Gaussian-like distribution with some distortion. However, in order to model the probability distribution function (pdf) of the glazed case, a secondary distribution has to be added as shown in Figure 4-15 [79]. The fitting procedure for the asperity height probability distribution function is illustrated in Figure 4-16 [79].

Figure 4-15. IC 1000 pad surface representative line scans and pad height probability distributions [79]. (a) Fully conditioned; (b) Wafer dominated (glazed).

One or two components from the Gaussian, exponentially modified Gaussian (EMG) and Pearson distributions are enough to model the asperity height distribution [78]. A newly conditioned pad surface without pad contact can usually be modeled by a single peak. A secondary component peak has to be added to model the deformation caused by the pad-wafer contact. A Pearson distribution is used to represent the bulk component, as shown in Figure 4-16. The Gaussian and EMG distributions are more effective for modeling the secondary component in the low and high deformation cases, respectively [79].

![Figure 4-16. Examples of the peak fitting procedure [72]. Left: low deformation; Right: high deformation.](image)

Castillo-Mejia et al. modeled the mechanical behavior of the asperity layer of a polishing pad and its effect on CMP [80]. The asperity height distributions used in the study are shown in Figure 4-17. The different asperity height distribution can be used to explain the material removal rate decay with pad wear. The figure clearly demonstrates the wear impact on the height distribution, and matches with the data in Figure 4-16.
A Greenwood-Williamson contact model is used to describe the asperity interaction with the wafer in their study. One interesting result from this mechanism is that the local contact pressure where the asperities touch the wafer is much higher in magnitude than the nominal applied down pressure, and the average local contact pressure is not sensitive to the nominal down pressure. This means that the real contact area of the asperities is roughly proportional to the nominal down pressure. This result is confirmed by measurement of the pad contact area under pressure using the confocal laser scanning microscope [74], as shown in Figure 4-18. The figure shows that the pad mechanical properties and the surface microstructure affect the contact area. Only about 1% of the area is in contact for an IC 1000 pad at 3 psi, while the contact area for a softer Politex pad under the same nominal down pressure is three to four times larger. The 30-100 times higher local contact pressure is highly correlated with defect numbers. There are substantial opportunities to optimize the pad surface properties and surface microstructure, and the pad conditioning parameters, to achieve high contact area and low contact pressure [81].
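
The link between contact area fraction and local contact pressure follows from a simple force balance: the nominal load is carried only by the contacting area. The sketch below is illustrative arithmetic under that assumption; the function name is hypothetical, and the Politex fractions are simply three and four times the quoted 1% IC 1000 value.

```python
def mean_contact_pressure(nominal_pressure, contact_fraction):
    """Average pressure on the real contact area, from force balance.

    nominal_pressure * total_area = contact_pressure * (contact_fraction * total_area)
    """
    if not 0.0 < contact_fraction <= 1.0:
        raise ValueError("contact_fraction must be in (0, 1]")
    return nominal_pressure / contact_fraction


nominal_psi = 3.0  # nominal down pressure quoted in the text
cases = (("IC 1000", 0.01), ("Politex (3x area)", 0.03), ("Politex (4x area)", 0.04))
for name, fraction in cases:
    local = mean_contact_pressure(nominal_psi, fraction)
    print(f"{name}: local/nominal pressure ratio = {local / nominal_psi:.1f}")
```

Under these assumptions the amplification is the reciprocal of the contact fraction, so contact fractions of a few percent give ratios of the same order as the 30-100 range quoted above.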
The small contact area result from the model and measurement data demonstrates that only the very top of the asperities interacts with the wafer during the polishing. As discussed previously, the tail of the asperity height distribution can be modeled as an exponential distribution. Thus the approximation used in Vlassak's paper [68] can be validated. The characteristic length $\lambda$ can vary significantly, from 0.1 $\mu$m to several microns as shown in Figure 4-17.

Along with the asperity height distribution, the asperity shape and size (or contact area) distributions can provide important information about the asperity behavior during polishing. Since the wafer topography is usually less than one micron in height and the pad asperity layer thickness is tens of microns, the contact area size and shape distribution under a given pressure can be treated as independent of the local deformation. Unlike the widely available asperity height distribution data, few publications mention the asperity size and shape distribution under load. Fortunately, the confocal laser scanning microscope [81] can provide this kind of information with good quality. Figure 4-19, for example, shows the contact area of IC 1000 under pressure.
Considering the mathematical simplicity, the flat tail from the asperity hierarchical structure and the irregular contact shape, a lognormal distribution as in Equation 4-10 is used to describe the size distribution of the contact area.

\[
p(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)
\]

(4-10)

where \(x\) is the asperity contact size, \(\mu\) and \(\sigma\) are characteristic parameters for the lognormal distribution.
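
A minimal sketch of Equation 4-10 follows. It assumes that $\mu$ and $\sigma$ are the log-space parameters of the distribution with the contact size $x$ expressed in micrometres; the helper names and the integration limits are illustrative choices.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Lognormal probability density of Equation 4-10 (zero for x <= 0)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal intervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


mu, sigma = 1.3, 0.8                 # parameter values used later for Figure 4-20
median = math.exp(mu)                # median contact size of the number distribution
mode = math.exp(mu - sigma ** 2)     # most probable contact size
mass = trapezoid(lambda x: lognormal_pdf(x, mu, sigma), 1e-6, 200.0, 200000)
print(f"median = {median:.2f}, mode = {mode:.2f}, integrated probability = {mass:.4f}")
```

The numerical integral is only a sanity check that the density is normalized over the size range of interest.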

Due to the linear relation between the contact area and the nominal down pressure, the asperity contact number distribution has to be transformed into an area weighted cumulative distribution function (cdf). Figure 4-20 shows the asperity contact number probability distribution and the cumulative contact area distribution with \(\mu = 1.3 \mu m\) and \(\sigma = 0.8 \mu m\). The contact area probability distribution is similar to the distribution in Nguyen’s paper [84] except we neglect the flat tail beyond 20 \(\mu m\). The Gaussian pad contact size distribution in Nguyen’s paper is derived on the assumptions of spherical
asperities with a Gaussian height distribution [84]. The real contact shape is likely to be highly irregular due to the pores [81].

\[
p^{\text{area}}(x) = \frac{p(x)\,x^2}{\int_0^{\infty} p(x')\,x'^2 \, dx'}
\]

(4-11)

\[
F(x) = \int_0^{x} p^{\text{area}}(x') \, dx' = \frac{\int_0^{x} p(x')\,x'^2 \, dx'}{\int_0^{\infty} p(x')\,x'^2 \, dx'}
\]

(4-12)

Figure 4-20. Hypothetical pad asperity contact size distribution.
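
For a lognormal number distribution, the area-weighted cumulative distribution of Equation 4-12 has a closed form, because weighting a lognormal density by $x^2$ gives another lognormal with its log-mean shifted to $\mu + 2\sigma^2$. The sketch below assumes the integration limits run from 0 to $x$ in the numerator and from 0 to infinity in the denominator, and cross-checks the closed form by numerical integration in $u = \ln x$. Function names are illustrative.

```python
import math


def area_weighted_cdf(x, mu, sigma):
    """Closed-form F(x) of Equation 4-12 for a lognormal p(x)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu - 2.0 * sigma ** 2) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_weighted_cdf_numeric(x, mu, sigma, n=20000):
    """Numerical F(x): integrate p(x) x^2 dx after substituting u = ln x."""
    lo, hi = mu - 10.0 * sigma, mu + 14.0 * sigma

    def weight(u):
        # p(x) x^2 dx becomes exp(-(u - mu)^2 / (2 sigma^2)) exp(2u) du, up to a constant
        return math.exp(-0.5 * ((u - mu) / sigma) ** 2 + 2.0 * u)

    def integral(a, b):
        h = (b - a) / n
        inner = sum(weight(a + i * h) for i in range(1, n))
        return h * (0.5 * (weight(a) + weight(b)) + inner)

    upper = min(max(math.log(x), lo), hi)
    return integral(lo, upper) / integral(lo, hi)


mu, sigma = 1.3, 0.8
for size in (5.0, 10.0, 20.0):
    closed = area_weighted_cdf(size, mu, sigma)
    numeric = area_weighted_cdf_numeric(size, mu, sigma)
    print(f"x = {size:5.1f}: closed form = {closed:.4f}, numeric = {numeric:.4f}")
```

The closed form avoids repeated numerical integration, which matters when $F$ has to be evaluated for every cell in a chip-scale simulation.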

4.3.2 Modified Density-Step-Height Model

In this section, the modified density-step-height model with the asperity height and size distribution information will be introduced. This updated version of the density-step-
height model will be incorporated into the three-part framework and presented in the next section.

Merchant et al. present a similar density-step-height model with asperity height and size distribution information [84]. Figure 4-21 summarizes their rough pad polishing model. They divided the polishing mechanism into three regimes: classical contact, asperity contact, and asperity filtering. When the trench size is over 1 mm in width, the classical contact mechanism dominates the polishing behavior. With increasing trench size, the pressure and removal rate on the wide trench bottom approach the nominal applied down pressure and blanket removal rate, due to the conformal pad bending into the recessed areas. If the trench size is between 40 μm and 1 mm, the dominant mechanism is asperity contact. In this case, the material removal rate is not zero, even though the predicted pad bending cannot reach the bottom of the trench. As shown in Figure 4-21, both pad bending and asperity deformation contribute to the effective pressure and removal rate at the bottom of the trench. This scenario matches the previous three-part framework described in Section 4.2 [78]. Finally, Merchant et al. proposed an asperity filtering mechanism for feature sizes less than 40 μm. They assumed that only asperities smaller than the feature size can contribute to the effective pressure on the recessed areas. Wider asperities are filtered by the trenches and only apply additional pressure to the raised areas. The pressure distribution is given by the following equations, which can be viewed as a variant of the previously introduced density-step-height model.

\[ P_r = \frac{p}{\rho + (1 - \rho)f(w_m)\exp(-\lambda s)} \]

(4-13)

\[ P_b = \frac{p\,f(w_m)\exp(-\lambda s)}{\rho + (1 - \rho)f(w_m)\exp(-\lambda s)} \]

(4-14)

where \( P_r \) and \( P_b \) are the pressures on the trench top and bottom respectively, \( p \) is the nominal down pressure, \( \rho \) is the pattern density referring to the raised area, \( f(w_m) \) is the asperity filtering fraction, \( \lambda \) characterizes the exponential asperity height distribution, and \( s \) is the step height. Like the previous density-step-height model, the area-weighted average pressure is equal to the nominal down pressure when the step height is zero. The distribution between the two positions depends on the density and step height. The major difference is that in this model the asperity height and size distributions are considered, and the asperity height distribution and filtering function tune the pressure distribution, rather than a critical step height \( h_c \) as in the previous density-step-height model. However, the asperities which are smaller than the trench are free standing and should not affect the pressure distribution in the way described in Equations 4-13 and 4-14. Instead, the unfiltered larger asperities should be responsible for the pressure redistribution on the top of the trench.
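
The two pressure expressions of the rough pad model above can be sketched as a single function. The function and argument names are illustrative, the input values are hypothetical, and note that in these expressions $\lambda$ multiplies the step height, so it carries units of inverse length here.

```python
import math


def rough_pad_pressures(p, rho, f_wm, lam, s):
    """Pressures on the raised area (P_r) and trench bottom (P_b).

    p    : nominal down pressure
    rho  : pattern density of the raised area (0 < rho <= 1)
    f_wm : asperity filtering fraction f(w_m), between 0 and 1
    lam  : decay constant of the asperity height distribution (1 / length)
    s    : step height (same length unit as 1 / lam)
    """
    attenuation = f_wm * math.exp(-lam * s)
    denominator = rho + (1.0 - rho) * attenuation
    return p / denominator, p * attenuation / denominator


p_nominal = 3.0  # psi, hypothetical
for step in (0.0, 0.05, 0.2):  # micrometres, hypothetical
    p_r, p_b = rough_pad_pressures(p_nominal, rho=0.5, f_wm=0.6, lam=10.0, s=step)
    average = 0.5 * p_r + 0.5 * p_b
    print(f"s = {step:4.2f}: P_r = {p_r:.3f}, P_b = {p_b:.3f}, area average = {average:.3f}")
```

In this form the area-weighted average of the two pressures equals the nominal pressure for any inputs, which is the conservation property that the modified model in this section relaxes.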

Nguyen et al. claim that the asperities with contact size smaller and larger than the feature size produce different effective pressures on the recessed areas [84]. Figure 4-22 illustrates the asperity contact size effect. However, their method requires prohibitive computational load and is not suitable for chip-scale modeling.

![Asperities](image)

Figure 4-22. Contact between asperities with relatively small and large trenches [84].

In the following paragraphs, a modified density-step-height model with the asperity size and height distribution is proposed to solve the problems mentioned above.

Here, the polishing mechanism can be divided into three regimes: pad bending with asperity deformation, asperity deformation, and asperity filtering. When the feature size is over 200 μm in width, the total pad surface topography is from pad bending and asperity deformation. The degree of the pad bending increases with the feature size, and the contact wear model can be used to capture the long range (>200 μm) height variation. This is implemented in part A of the three-part framework, to calculate the pressure distribution on the large blocks. When the feature size is between 40 and 200 μm in width, all of the asperities can reach into the trench and touch the bottom. Most of the
deformation is from asperities and there is no significant contribution from the pad bending. The pressure distribution of the trench bottom and top is constant in this range, as shown in Figure 4-20. From Equation 4-9, the pressure on the upper area and lower area can be calculated by the following equations.

\[ p^u = p^0 \]  \hspace{1cm} (4-15)

\[ p^l = p^0 \exp\left(-\frac{s}{\lambda}\right) \]  \hspace{1cm} (4-16)

where \( \lambda \) is the characteristic length in the exponential asperity height probability function and \( p^0 \) is the local pressure when the step height is zero. In this case, the area-weighted average pressure of the patterned areas is not constant and decays exponentially with step height. At the same time, the pressure distribution is not sensitive to the feature size in this range. This pressure distribution relation can be used in part B of the three-part framework to redistribute the block pressure onto the small cells in each block. Based on the information from Figure 4-20, \( \exp(s/\lambda) = 20 \), so that \( \lambda = s/\ln 20 \approx s/3 \). The value of \( \lambda \) can therefore be estimated to be one third of the step height, which ranges from 1000 to 3000 Å. When the feature size decreases further, below 40 μm in width, the asperity filtering mechanism dominates the polishing behavior. Here, the different impacts of asperities that are small and large with respect to the feature size on the pressure distribution on the top and bottom of the trench are discussed and simplified for integration with chip-scale modeling.
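
The estimate of the characteristic length from a pressure ratio of 20 is a one-line inversion of the exponential relation. The short sketch below uses a hypothetical helper name and evaluates it for the quoted range of step heights.

```python
import math


def characteristic_length(step_height, pressure_ratio):
    """Solve exp(s / lambda) = pressure_ratio for lambda."""
    return step_height / math.log(pressure_ratio)


for s in (1000.0, 2000.0, 3000.0):  # step heights in angstroms
    lam = characteristic_length(s, 20.0)
    print(f"s = {s:6.0f} A -> lambda = {lam:7.1f} A (s / lambda = {s / lam:.3f})")
```

Since $\ln 20$ is close to 3, the result is about one third of the step height in every case.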

If the asperity size is larger than the pitch size (line space + line width), the asperity deformation can be approximated as pad bending, as shown in Figure 4-23. Thus the results for pad bending can be used on the asperity bending. The pressure on the trench top and bottom from the large asperities can be described in the following equations:
\[ p^{u,1} / p^0 = (1 - F(w_{\text{lw}} + w_{\text{ls}}))\,\frac{1 - D_{\text{cell}} \exp(-s / h_c)}{1 - D_{\text{cell}}} \]  
(4-17)

\[ p^{l,1} / p^0 = (1 - F(w_{\text{lw}} + w_{\text{ls}}))\exp(-s / h_c) \]  
(4-18)

\[ h_c = kw_{\text{lw}}(1 - D_{\text{cell}})^{1/3} \]  
(4-19)

where \( p^{u,1} \) and \( p^{l,1} \) are the effective pressures on the trench top and bottom, respectively, from the asperities larger than the pitch size, \( w_{\text{lw}} \) and \( w_{\text{ls}} \) are the line width and line space respectively, \( (1 - F(w_{\text{lw}} + w_{\text{ls}})) \) is the area fraction of the asperities larger than the pitch size, \( h_c \) is the critical step height, \( D_{\text{cell}} \) is the local cell pattern density, and \( k \) is the coefficient for the critical step height, which should be inversely proportional to the asperity Young’s modulus. Usually, the critical step height is less than \( \lambda \). The asperity Young’s modulus is about an order of magnitude smaller than the Young’s modulus of the bulk material [84].

Figure 4-23. Asperity bending over the whole pitch.

When the asperity size is less than the line width, \( w_{\text{lw}} \), the asperities can contact the bottom of the recessed area and apply the pressure freely without intervention from the pressure on the top of the trenches. Here, the asperity height distribution is assumed to be independent of the contact size distribution. Equation 4-16 can be applied in this case.
When the asperity size is between $w_{lw}$ and $w_{lw} + w_{ls}$, some interaction between the top and bottom is possible, as shown in Figure 4-24. However, the effect of this interaction on the bottom is not very significant and is ignored here as a first-order approximation. The effective pressure on the recessed area from the asperities with size less than the pitch size can therefore be described by the following equation:

$$p^{l,2}/p^{0} = F(w_{lw} + w_{ls})\exp(-s/\lambda)$$ (4-20)

![Asperity bending over line space.](image)

Figure 4-24. Asperity bending over line space.

However, the interaction effect from the asperities with contact size between $w_{lw}$ and $w_{lw} + w_{ls}$ cannot be ignored for the pressure on the top of the trench, especially in the high density and small line space case. As shown in Figure 4-24, the asperity can apply a much higher pressure than $p^{0}$ on the small trench top. This case lies between the free-contact case and the asperity bending case shown in Figure 4-23. Considering that the weight-averaged cell pressure cannot be larger than $p^{0}$, the pressure on the trench top from the small asperities can be described by the following equation:

$$p^{u,2}/p^{0} = F(w_{lw}) + (F(w_{lw} + w_{ls}) - F(w_{lw}))\,\frac{1 - D_{cell}\exp(-s/\lambda)}{1 - D_{cell}}$$ (4-21)
Then the pressure on the top and bottom of the trenches and the average cell pressure are expressed in the following equations:

\[ p^u = p^{u,1} + p^{u,2} \]  
(4-22)

\[ p^l = p^{l,1} + p^{l,2} \]  
(4-23)

\[ p^{\text{avg}} = p^l D_{cell} + p^u (1 - D_{cell}) \]  
(4-24)

\[ g = \frac{p^{\text{avg}}}{p^0} \]  
(4-25)

where \( g \) is the ratio between the average cell pressure with step height and the pressure with no step height. Because the weight-averaged cell pressure cannot be larger than \( p^0 \), \( g \) cannot exceed 1.

Figure 4-25. Average cell pressure as a function of step height.
Figure 4-25(a) shows the effective pressure trend due to the large asperities as a function of the step height, while Figure 4-25(b) shows the effective pressure trend due to the small asperities. Figure 4-25(c) shows the overall effective pressure on the trench top and bottom and the average cell pressure as a function of the step height. Here, the line width and line space are 5 μm, \( \lambda \) is 1000 Å, and \( k \) is 0.002. The pressure distribution on the trench top and bottom is close to that found with the previous density-step-height model, except for the flat tail and non-conservative average cell pressure due to the asperity filtering effect. By considering the pad morphology, the density-step-height model can more closely conform to the reality of the pad surface structure.
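
The modified density-step-height model can be sketched in a few lines for the parameter values quoted above (5 μm line width and line space, $\lambda$ = 1000 Å, $k$ = 0.002). This is a sketch under several assumptions rather than the implementation used for Figure 4-25: the bracketed term in Equation 4-17 is taken to be divided by $(1 - D_{cell})$, the lower size limit in Equation 4-21 is read as the line width, $D_{cell}$ is taken as the line fraction of the pitch, and $F$ uses the closed-form area-weighted lognormal CDF with $\mu$ = 1.3 and $\sigma$ = 0.8. All function names are hypothetical.

```python
import math


def area_cdf(x, mu=1.3, sigma=0.8):
    """Area-weighted lognormal CDF F(x), x in micrometres (closed form, assumed)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu - 2.0 * sigma ** 2) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cell_pressures(s, w_lw, w_ls, lam, k):
    """Return (p_u / p0, p_l / p0, g) for step height s; all lengths in micrometres."""
    d = w_lw / (w_lw + w_ls)                     # assumed definition of D_cell
    h_c = k * w_lw * (1.0 - d) ** (1.0 / 3.0)    # Equation 4-19
    f_pitch = area_cdf(w_lw + w_ls)              # asperities smaller than the pitch
    f_line = area_cdf(w_lw)                      # asperities smaller than the line width
    bend = math.exp(-s / h_c)                    # large-asperity (bending) decay
    free = math.exp(-s / lam)                    # small-asperity (free contact) decay
    p_u1 = (1.0 - f_pitch) * (1.0 - d * bend) / (1.0 - d)               # Eq. 4-17
    p_l1 = (1.0 - f_pitch) * bend                                       # Eq. 4-18
    p_l2 = f_pitch * free                                               # Eq. 4-20
    p_u2 = f_line + (f_pitch - f_line) * (1.0 - d * free) / (1.0 - d)   # Eq. 4-21
    p_u, p_l = p_u1 + p_u2, p_l1 + p_l2                                 # Eqs. 4-22, 4-23
    g = p_l * d + p_u * (1.0 - d)                                       # Eqs. 4-24, 4-25
    return p_u, p_l, g


for step in (0.0, 0.01, 0.05, 0.1, 0.3):  # step heights in micrometres
    p_u, p_l, g = cell_pressures(step, w_lw=5.0, w_ls=5.0, lam=0.1, k=0.002)
    print(f"s = {step:4.2f} um: top = {p_u:.3f}, bottom = {p_l:.3f}, g = {g:.3f}")
```

With these readings the large-asperity terms conserve pressure exactly, and the average cell pressure ratio reduces to $g = 1 - D_{cell} F(w_{lw})\,(1 - e^{-s/\lambda})$, so only the freely standing small asperities make the average cell pressure non-conservative.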

4.3.3 Three-Part Framework

As discussed in the previous section, the three-part framework is an effective methodology for the chip-scale CMP modeling and simulation. The further improved three-part model with the asperity height and contact size distributions is presented in this section.

The new three-part model is similar to the previous three-part model as shown in Figure 4-9. In part A, the block pressure distribution (in 180×180 μm blocks) is calculated by using a contact wear model. Each block is further divided into 20×20 μm cells. The block height, \( T_i \), is the mean value of the average height, \( T_{i,j} \), of the cells in each block. In part B of the three-part framework, a further pressure distribution into 20×20 μm cells of each 180×180 μm block apportions the effective pressure distribution into the higher resolution cells. The pressure re-distribution method is shown in the following equations:

\[
T_i = \overline{T_{i,j}}
\]  

(4-26)

\[ p_{i,j}^0 = S_i \exp \left( \frac{T_{i,j} - T_i}{\lambda} \right) \]  

(4-27)

\[ p_i = \overline{p_{i,j}^0 \, g_{i,j}} \]  

(4-28)

where \( S_i \) is the scale factor for block \( i \), \( p_i \) is the block pressure from part A, and the overbar denotes the average over the cells \( j \) in block \( i \). The value of \( S_i \) is determined from pressure conservation in each block, as given by Equation 4-28. Once \( p_{i,j}^0 \) is calculated, the pressures on the trench top and bottom can be calculated with the modified density-step-height model of the last section, which is part C in the three-part framework.
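
The part B redistribution can be sketched for a single block as follows. The sketch assumes that the reference height in the exponent is the block mean height and that pressure conservation means the cell average of $p_{i,j}^0 g_{i,j}$ equals the block pressure; the function name and the sample values are hypothetical.

```python
import math


def redistribute_block_pressure(p_block, cell_heights, g_factors, lam):
    """Return the zero-step-height cell pressures p0_ij for one block.

    p_block      : block pressure from part A
    cell_heights : average cell heights T_ij (same length unit as lam)
    g_factors    : average-pressure ratios g_ij from the step-height model
    lam          : asperity height characteristic length
    """
    t_block = sum(cell_heights) / len(cell_heights)                 # block mean height
    shape = [math.exp((t - t_block) / lam) for t in cell_heights]   # unscaled cell pressures
    mean_load = sum(e * g for e, g in zip(shape, g_factors)) / len(shape)
    scale = p_block / mean_load                                     # S_i from conservation
    return [scale * e for e in shape]


heights = [1200.0, 1000.0, 900.0, 1100.0]   # angstroms, hypothetical cell heights
g_values = [1.0, 0.95, 0.9, 1.0]            # hypothetical g_ij values
cell_p0 = redistribute_block_pressure(3.0, heights, g_values, lam=800.0)
recovered = sum(p * g for p, g in zip(cell_p0, g_values)) / len(cell_p0)
print("cell pressures:", [round(p, 3) for p in cell_p0])
print("recovered block pressure:", round(recovered, 6))
```

Higher cells receive exponentially more pressure, while the scale factor guarantees that the block pressure computed by the contact wear model is preserved.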

![Graph](image)

**Figure 4-26. Cu removal rate vs. down force for different Cu abrasive-free slurry solutions [86].**

After the pressure on the trench top and bottom of each cell across the whole chip is available, the material removal amount can be calculated given the time step and the material removal rate functions. The copper slurry for overburden copper film removal in the test cases is the high-selectivity abrasive-free slurry C430 with strong non-Prestonian behavior, as shown in Figure 4-26, while the barrier slurry is the non-selective abrasive slurry Cu10K, which has a Prestonian rate dependence on pressure. The purpose of the abrasive-free slurry is to reduce dishing and erosion dramatically, since there is no material
removal at low pressure. Another interesting property of the slurry is the strong
correlation between the chemical concentration level in the slurry and the copper removal
rate, as shown in Figure 4-26. Solution A has the lowest chemical concentration while
solution C has the highest. The non-Prestonian removal rate can be expressed as Equation
4-29.

\[ R = K_r (P - P_{th})^\alpha \nu^\beta \quad \text{for } P > P_{th} \]  

(4-29)

where \( R \) is material removal rate, \( K_r \) is material removal rate Preston coefficient, \( P_{th} \) is
the pressure threshold for polishing, \( \alpha \) and \( \beta \) are the pressure and velocity indices,
respectively. As a further simplification (assuming constant polish velocity), the velocity
dependence can be ignored and the removal rate equation is rewritten as:

\[ R = R_0 \left( \frac{P - P_{th}}{P_0 - P_{th}} \right)^\alpha \quad \text{for } P > P_{th} \]  

(4-30)

where \( R_0 \) is material removal rate at the nominal down force \( P_0 \). The nominal down
forces used in the copper and barrier layer polishing test cases studied here are 3.11 and
2.80 psi, respectively.
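
Equation 4-30 can be sketched as a threshold function that returns zero removal at or below the threshold pressure. The numerical values in the example are illustrative assumptions (a 10 kPa threshold, a pressure index of 0.75, and a rate of 91 Å/sec at a nominal pressure of 21.4 kPa), and the function name is hypothetical.

```python
def removal_rate(p, r0, p0, p_th, alpha):
    """Non-Prestonian removal rate of Equation 4-30 at constant polish velocity.

    p     : local pressure
    r0    : removal rate at the nominal pressure p0
    p_th  : threshold pressure below which no material is removed
    alpha : pressure index
    """
    if p <= p_th:
        return 0.0
    return r0 * ((p - p_th) / (p0 - p_th)) ** alpha


for pressure_kpa in (5.0, 10.0, 15.0, 21.4, 30.0):
    rate = removal_rate(pressure_kpa, r0=91.0, p0=21.4, p_th=10.0, alpha=0.75)
    print(f"P = {pressure_kpa:5.1f} kPa -> R = {rate:6.1f} A/sec")
```

Recessed areas whose local pressure falls below the threshold are not polished at all, which is the mechanism behind the reduced dishing and erosion attributed to the abrasive-free slurry.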

4.4 M1 Calibration and Verification

In this section, model calibration results for MIT/SEMATECH 854 M1 test wafers,
including the overburden copper polishing and barrier polishing steps, will be presented.
Based on the initial calibration, a dependence on temperature during the polish step is
considered, and an improved model fit accounting for this effect is described. This is
followed by presentation of verification results for a MagnaChip internal mask.

4.4.1 Step 1 Calibration Results

In this subsection, the calibration results for the copper polishing step (step 1) will be summarized with appropriate and necessary explanation. For calibration, we use the available data to extract and fit the CMP model; the measure of goodness is the model fit error. In verification, we use the model extracted from the test layout, to predict the results for a different (e.g. product or other test pattern) layout, where the measure of goodness is model prediction error.

Based on information in the literature, some parameters are specified rather than being set by model fitting to experimental data. These are:

\[ P_{th} = 10\ \text{kPa}, \quad \alpha = 0.75, \quad \mu = 1.3\ \mu\text{m}, \quad \sigma = 0.8\ \mu\text{m} \]

Other parameters are fit to the data, using a least squares minimization procedure. The fitted model parameters include:

\[ R_{Cu} = 91\ \text{Å/sec}, \quad R_{TaN} = 0.6\ \text{Å/sec}, \quad R_{Oxide} = 0.6\ \text{Å/sec}, \quad (\text{at 3.11 psi or 21.4 kPa}) \]

\[ E_{pad} = 80\ \text{MPa}, \quad k = 0.003, \quad \lambda = 800\ \text{Å} \]

The fitted values of the copper, tantalum nitride and oxide removal rates for the abrasive-free slurry at the nominal down force of 3.11 psi are 91, 0.6 and 0.6 Å/sec, respectively. The removal rate data from the blanket wafers are 92, 0.02 and ~0 Å/sec, respectively. The differences between the fitted values and the blanket removal rate data for TaN and oxide will be discussed later.

The fitted Young’s modulus for the IC 1000 polishing pad is 80 MPa, which is significantly lower than the literature values for the bulk material, which range from 300 to 500 MPa depending on the sample size and test methods. There are several reasons for this discrepancy. In the contact wear model, the mechanical properties of the top of the pad
bulk determine the pad deformation, rather than the whole thickness of the pad. Due to the wear and damage from the high pressure conditioning, the effective Young’s modulus for the top of the bulk pad can be degraded. In addition, in the three-part framework, the pad bulk deformation is approximated as the global deformation which might include some contribution from the pad asperities. Indeed, there is no clear line to divide the deformation of the pad bulk and the pad asperities, and one would expect that the approximation can further lower the effective Young’s modulus based on the fit to patterned wafer polish data. It is generally found in the CMP model simulation that the fitted Young’s modulus is smaller than the value from the bulk materials.

The RMS (root mean square) fitting errors for different heights and thicknesses in the copper polishing step are listed in Table 4-1. The results for dishing and erosion after copper clearance are less than 100 Å, and are better than the 100-500 Å RMS errors in Tugbawa’s previous model [49]. The apparent accuracy improvement of the three-part framework comes from the hierarchical structure in the polishing mechanism. At the same time, although the accuracy improvement from the incorporation of the pad asperity distributions in the model calibration is insignificant, the updated three-part framework has better flexibility to deal with random layouts. However, there is no improvement for the fitting errors of the step height, array height and field copper thickness in Table 4-1. The lack of additional improvement might be due to significant temperature effects in the copper polishing step; this issue will be discussed further in Section 4.4.2.
Table 4-1. CMP chip-scale modeling error for MIT/SEMATECH 854 M1

<table>
<thead>
<tr>
<th></th>
<th>RMS Error (Å)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Step Height</td>
<td>80</td>
</tr>
<tr>
<td>Array Height</td>
<td>167</td>
</tr>
<tr>
<td>Copper Field Thickness</td>
<td>178</td>
</tr>
<tr>
<td>Dishing</td>
<td>85</td>
</tr>
<tr>
<td>Erosion</td>
<td>37</td>
</tr>
</tbody>
</table>
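
As a reminder of how an RMS fitting error of this kind is computed, the following minimal Python sketch evaluates the root-mean-square difference between paired measured and simulated values. The function name `rms_error` and the sample numbers are hypothetical and are not thesis data; the actual calibration code is not shown in this chapter.

```python
import math

def rms_error(measured, simulated):
    """Root-mean-square difference between paired measurements and predictions."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical dishing values in Å, for illustration only (not thesis data).
measured_dishing = [120.0, 340.0, 510.0, 80.0]
simulated_dishing = [100.0, 400.0, 450.0, 110.0]

print(round(rms_error(measured_dishing, simulated_dishing), 1))  # 46.1
```

Because the errors are squared before averaging, a few poorly predicted structures can dominate the reported value.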

The line scan simulation results with 20 μm resolution are compared to the HRP (high resolution profilometry) scans with 0.2 μm resolution in Figures 4-27 through 4-33. The simulation results after copper clearance capture most of the height variations in the leveled and adjusted HRP scans. However, the prediction results for the initial 50 sec in the copper polishing step are not as accurate as the simulation scans after the copper clearance. Because CMP generally proceeds through to copper clearing and then barrier removal, the smaller error after copper clearance is preferred. Figure 4-33 shows an expanded view of a subset of the array scans pictured in Figure 4-30.
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 4-27. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec (X: μm, Y: Å).
Figure 4-28. Array topography simulation for M1 step 1 at 40 sec (X: μm, Y: Å).
Figure 4-29. Array topography simulation for M1 step 1 at 50 sec (X: μm, Y: Å).
Figure 4-30. Array topography simulation for M1 step 1 at 100 sec (X: µm, Y: Å).
Figure 4-31. Array topography simulation for M1 step 1 at 130 sec (X: μm, Y: Å).
Figure 4-32. Array topography simulation for M1 step 1 at 160 sec (X: μm, Y: Å).
Figure 4-33. Simulation of four array scans for M1 step 1 at 100 sec (X: μm, Y: Å).

The chip-scale modeling of the evolution of the 3-D topography and step height maps is shown in Figures 4-34 to 4-46. The barrier layer starts to be exposed at 70 sec, and the process is finished at 86 sec. The chip-scale results clearly reveal the good planarization ability of the process, due largely to the pressure threshold for copper removal of the abrasive-free slurry. Hard pads, such as the IC 1000, have better planarization ability and have been widely used in the industry for over a decade. However, hard pads have other issues, such as higher post-CMP defect levels and lower wafer-level removal rate uniformity for wafers with some warp or bow. The most promising advantage of the abrasive-free slurry is that it introduces additional planarization ability, due to its high material selectivity and threshold pressure behavior. One side effect is the potential difficulty in completely clearing copper from depressed regions on the chip, and care must be taken to ensure complete clearing. Nevertheless, abrasive-free slurries are considered to be one of the key technologies in copper CMP to reduce dishing, erosion and total copper loss. The step height in the post-electroplating topography is eliminated in the very initial stage of the copper polishing, because the removal rate in the recessed area is low or zero. After the copper is cleared, the dishing remains almost constant even with extensive over-polishing. This property gives a wide process window in copper CMP and flexibility in process integration. The requirement on the pre-CMP topography can be relaxed, and a high-bump ECD process can also be acceptable when an abrasive-free slurry is used. The abrasive-free slurry also widens the design space for multilevel copper metallization, due to the low post-CMP height variation.

The final wafer topography is affected not only by the pad mechanical properties, the slurry removal rate relation with pressure, and the layout parameters including line width, line space and pattern density, but also by the size of the line array regions within the layout. The effect of the size of array regions on the surface height variation, especially for high pattern density arrays, can be seen clearly by comparing the dishing and erosion results of two different array regions, both with 100 μm line width and 1 μm line space. As we saw previously in Figure 4-21, the deformation from the pad bending is highly related to the region size. For a high pattern density region with narrow spaces, the pad bulk can hardly sense the narrow line spaces, because the asperities have a high local deformation ability. In this case, the effective copper feature size is almost equal to the size of the entire array area, rather than a series of separable lines with the given line width of the individual features. Thus, the pad bulk can reach down into the deeper recessed area of the wide high-pattern-density array and apply an effective pressure that is almost equal to the nominal down force, producing a large apparent copper loss. Indeed, the dishing and erosion of the very wide region with high pattern density increase continuously during the over-polishing. Although the abrasive-free slurry can still play a role in reducing dishing and erosion in this case, the effect is significantly limited by the pad bulk bending. Thus a simple local layout design rule for multilevel copper metallization cannot be easily constructed. One approach is to use extremely conservative design rules [85], which might disallow even relatively safe small regions with large copper and small dielectric features. Ultimately, a potential solution for this issue is to do chip-scale screening at the layout design stage, which is one of the key applications of this research.
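
The limitation of a purely local rule can be illustrated with a small Python sketch. Everything in it is hypothetical: the `ArrayRegion` type, the two rule functions, and the thresholds (0.9 pattern density, 1000 μm region size) are assumptions chosen only for illustration, not rules from this work or from [85]. The point is that two arrays with the same 100/1 line width and line space look identical to a local rule, while a rule that also looks at region size can separate them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArrayRegion:
    line_width_um: float
    line_space_um: float
    region_size_um: float  # lateral size of the whole array region

    @property
    def pattern_density(self):
        return self.line_width_um / (self.line_width_um + self.line_space_um)

def local_rule_flags(region, max_density=0.9):
    """Purely local rule: looks only at line width and line space (assumed threshold)."""
    return region.pattern_density > max_density

def region_aware_flags(region, max_density=0.9, max_region_um=1000.0):
    """Also requires the array region to be wide (both thresholds are assumptions)."""
    return region.pattern_density > max_density and region.region_size_um > max_region_um

small = ArrayRegion(line_width_um=100.0, line_space_um=1.0, region_size_um=200.0)
wide = ArrayRegion(line_width_um=100.0, line_space_um=1.0, region_size_um=2000.0)

for name, region in (("small", small), ("wide", wide)):
    print(name, local_rule_flags(region), region_aware_flags(region))
```

Even the region-aware check is only a crude proxy; it does not replace chip-scale simulation, which accounts for pad bending and neighboring structures directly.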

Figure 4-34. Chip-scale CMP modeling for M1 step 1 at t=0 sec.

Figure 4-35. Chip-scale CMP modeling for M1 step 1 at t=10 sec.
Figure 4-36. Chip-scale CMP modeling for M1 step 1 at t=20 sec.

(a) 3-D topography map (Å)  
(b) Step height map (Å)

Figure 4-37. Chip-scale CMP modeling for M1 step 1 at t=30 sec.

(a) 3-D topography map (Å)  
(b) Step height map (Å)

Figure 4-38. Chip-scale CMP modeling for M1 step 1 at t=40 sec.

(a) 3-D topography map (Å)  
(b) Step height map (Å)
Figure 4-39. Chip-scale CMP modeling for M1 step 1 at t=50 sec.

Figure 4-40. Chip-scale CMP modeling for M1 step 1 at t=60 sec.

Figure 4-41. Chip-scale CMP modeling for M1 step 1 at t=70 sec.
Figure 4-42. Chip-scale CMP modeling for M1 step 1 at t=80 sec.
(a) 3-D topography map (Å)  
(b) Step height map (Å)

Figure 4-43. Chip-scale CMP modeling for M1 step 1 at t=90 sec.
(a) 3-D topography map (Å)  
(b) Step height map (Å)

Figure 4-44. Chip-scale CMP modeling for M1 step 1 at t=100 sec.
(a) 3-D topography map (Å)  
(b) Step height map (Å)
4.4.2 Temperature Effects in Step 1 Polishing

Temperature has a significant impact on the copper CMP process, and recent attention has focused on the temperature dependency of removal rate [87, 88, 89, 90]. Not only can the mechanical properties of the pad and asperities vary with the temperature, but also the abrasive-free slurry removal rates can be highly affected by the temperature variation during the process. As mentioned in the previous sections, the local contact pressure is almost 100 times higher than the nominal down force. The high local pressure can produce a large amount of heat locally and further affect the pad asperity properties.
Due to the chemically dominated polishing mechanism of the abrasive-free slurry introduced in Chapter 1, temperature plays an important role in the polishing. In addition, the wafer-level non-uniformity of the removal rate for the abrasive-free slurry can partly be related to uneven temperature variation. Figure 4-47 shows the impact of temperature on the friction force and removal rate of the abrasive-free slurry C430. It is clear that a high temperature will decrease $P_{th}$ and increase $\alpha$ in Equation 4-27, so that the dependence of the copper removal rate on down force becomes closer to the linear Preston’s equation at high temperature.

![Figure 4-47. Temperature impacts on C430 removal rate and friction force [91].](image)

The model calibration results for the initial copper polishing step are not as good as the results after the overburden copper is cleared. There is no parameter that can be optimized (fit) to achieve good matching in both the initial polishing and over-polishing stages at the same time. It is highly possible that the temperature variation during the polishing process causes this discrepancy. Xie et al. [92] have proposed a model for an endpoint detection (EPD) method that uses the STI CMP motor current to minimize over-polishing. The time evolution of the motor current during STI CMP is found to be related to the friction force evolution due to short-term and long-term surface roughness, and to the differences in the friction coefficients of the two materials (oxide and nitride), as shown in Figure 4-48. The heat generation on the polished wafer is assumed to be proportional to the friction force (and thus to the motor current). Thus, the short-term and long-term surface roughness and the material friction coefficients play a key role in the wafer surface temperature. Similar to STI wafers, the initial copper polishing stage has a much higher short-term and long-term surface roughness than the time periods just before and after the copper clearance, as seen in the series of time-evolution maps of the chip-scale surface topography presented above. The abrasive-free slurry is highly reactive with the electroplated copper film and generates a soft copper complex film on the surface, but it is inert to the barrier materials, with a nearly zero polishing rate. Thus, it is expected that the friction coefficient of Ta is smaller than that of copper, so that the short-term and long-term surface roughness dominates the overall friction force. In this case, the friction force variation in Cu polishing can cause the wafer surface temperature to vary over the time of the polish. Borst [93] took infrared images of the pad surface during polishing and measured the temperature variation of the pad surface just exposed from the carrier. He found that the \textit{ex situ} temperature variation can reach $\pm 5\,^{\circ}\mathrm{C}$, and concluded that the temperature variation is caused by the different friction during CMP. The higher surface temperature in the initial polishing stage can affect both the asperity mechanical properties and the dependence of the removal rate on down force for the abrasive-free slurry. Here, a special model calibration step using only the first 50 sec of copper polishing is carried out to confirm the analysis.

The fitted model parameters for the first 50 sec of polishing are:
\[ R_{Cu} = 93.5\ \text{Å/sec} \quad (\text{at 3.11 psi or 21.4 kPa}) \]

\[ E_{pad} = 80\ \text{MPa}, \quad P_{th} = 3\ \text{kPa}, \quad \alpha = 0.85 \]

\[ K = 0.006, \quad \mu = 1.3\ \mu\text{m}, \quad \sigma = 0.8\ \mu\text{m}, \quad \lambda = 1500\ \text{Å} \]

In comparison to the model parameters fit for the entire bulk copper polishing step, the asperity-related parameters and the removal rate show a shift consistent with a temperature increase. The decreased \( P_{th} \) and increased \( \alpha \) make the abrasive-free slurry behave more like Preston’s equation, with a linear dependence of rate on pressure. This tendency matches the impact of high temperature on the copper removal rate [91]. In addition, the larger values of \( K \) and \( \lambda \) mean that the asperities are softer and can more easily reach into the recessed area to apply effective contact pressure. As shown in Table 4-2, the RMS errors for array height, step height and field copper thickness are reduced in the new model fitting. The topography simulation results for the HRP array scans, shown in Figure 4-49 to Figure 4-51, further support the validity of the new model fitting, which accounts for a temperature shift.
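
The tendency toward Preston behavior can be made concrete with a small Python sketch. Equation 4-27 is not reproduced in this section, so the functional form of `threshold_rate` below is an assumed stand-in, chosen only because it gives zero removal below \( P_{th} \) and reduces to the linear Preston form when \( P_{th} = 0 \) and \( \alpha = 1 \). It should not be read as the model actually used in this work. The numerical values are the fitted parameters listed above for the first 50 sec.

```python
def preston_rate(pressure_kpa, ref_rate, ref_pressure_kpa):
    """Linear Preston scaling of a reference removal rate with pressure."""
    return ref_rate * pressure_kpa / ref_pressure_kpa

def threshold_rate(pressure_kpa, ref_rate, ref_pressure_kpa, p_th_kpa, alpha):
    """Assumed threshold-pressure form (NOT Equation 4-27 of the thesis).

    Zero below p_th_kpa, equal to ref_rate at ref_pressure_kpa, and identical
    to preston_rate when p_th_kpa = 0 and alpha = 1.
    """
    if pressure_kpa <= p_th_kpa:
        return 0.0
    scaled = (pressure_kpa - p_th_kpa) / (ref_pressure_kpa - p_th_kpa)
    return ref_rate * scaled ** alpha

R_CU = 93.5    # Å/sec at the reference pressure (fitted, first 50 sec)
P_REF = 21.4   # kPa (3.11 psi)
P_TH = 3.0     # kPa (fitted)
ALPHA = 0.85   # fitted

for p in (2.0, 5.0, 10.0, 21.4):
    print(p, round(threshold_rate(p, R_CU, P_REF, P_TH, ALPHA), 1),
          round(preston_rate(p, R_CU, P_REF), 1))
```

In this assumed form, lowering \( P_{th} \) shrinks the pressure range with no removal, and moving \( \alpha \) toward 1 straightens the curve, which is the qualitative trend described above.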

Figure 4-48. Endpoint detection motor current of a patterned STI wafer [92].
Table 4-2. CMP chip-scale modeling error for MIT/SEMATECH 854 M1 for the initial 50 sec polishing.

<table>
<thead>
<tr>
<th></th>
<th>RMS Error (Å)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Step Height</td>
<td>67</td>
</tr>
<tr>
<td>Array Height</td>
<td>152</td>
</tr>
<tr>
<td>Copper Field Thickness</td>
<td>163</td>
</tr>
</tbody>
</table>
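
For a quick side-by-side reading of Tables 4-1 and 4-2, the Python sketch below computes the absolute and relative reduction in RMS error for the three quantities that appear in both tables. The numbers are copied from the tables; the helper name `rms_reduction` is hypothetical. Note that Table 4-1 comes from the fit over the whole copper polishing step while Table 4-2 covers only the first 50 sec, so this is the same comparison the text makes rather than a like-for-like benchmark.

```python
# RMS errors in Å, copied from Table 4-1 (overall fit) and Table 4-2 (first 50 sec fit).
overall_fit = {"Step Height": 80, "Array Height": 167, "Copper Field Thickness": 178}
first_50s_fit = {"Step Height": 67, "Array Height": 152, "Copper Field Thickness": 163}

def rms_reduction(before, after):
    """Return {quantity: (absolute reduction in Å, relative reduction in %)}."""
    result = {}
    for name, old in before.items():
        delta = old - after[name]
        result[name] = (delta, 100.0 * delta / old)
    return result

for name, (delta, pct) in rms_reduction(overall_fit, first_50s_fit).items():
    print(f"{name}: {delta} Å lower ({pct:.1f}%)")
```
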
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 4-49. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec (X: μm, Y: Å).
Figure 4-50. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 40 sec (X: μm, Y: Å).
Figure 4-51. Array topography simulation for MIT/SEMATECH 854 M1 step 1 polishing at 50 sec (X: µm, Y: Å).
The new chip-scale topography simulation results for the initial 50 sec polishing are shown in the following figures, in comparison to the previous results from the overall model fitting. The change in the asperity mechanical properties and the copper removal rate significantly reduces the step-height elimination speed due to the improvement in copper removal at the recessed areas.

Figure 4-52. Chip-scale CMP modeling for M1 step 1 at t=10 sec.

Figure 4-53. Chip-scale CMP modeling for M1 step 1 at t=20 sec.
Figure 4-54. Chip-scale CMP modeling for M1 step 1 at t=30 sec.

Figure 4-55. Chip-scale CMP modeling for M1 step 1 at t=40 sec.

Figure 4-56. Chip-scale CMP modeling for M1 step 1 at t=50 sec.
(a) $E_{pad} = 120$ MPa and $\lambda = 1500\,\text{Å}$
(b) $E_{pad} = 80$ MPa and $\lambda = 800\,\text{Å}$

Figure 4-57. Sensitivity analysis for MIT/SEMATECH 854 M1 step 1 polishing at 20 sec (X: μm, Y: Å).
Figure 4-57 shows the impact on the simulation of the array scans of intentionally changing $E_{pad}$ and $\lambda$ away from the optimal fit values, thus providing information on the sensitivity of the simulation results to key model parameters. By comparing Figure 4-57 with Figure 4-49, the tendency of each parameter can be summarized and used in the parameter optimization process. A higher value of $E_{pad}$ results in a higher planarization capability of the polishing pad, which makes the post-CMP topography smoother, as shown in Figure 4-57 (a). Smaller values of $\lambda$ imply lower removal rates in the recessed areas during polishing, as we see in Array 100/1 of Figure 4-57 (b). It is the profile shape of structures such as Array 100/1, rather than the RMS errors of dishing and erosion, that is sensitive to $E_{pad}$ and $\lambda$. This key structure is used to determine the values of $E_{pad}$ and $\lambda$ during model optimization. For another important fitting parameter, the copper removal rate, the sensitive quantity is the RMS error of the overall copper thickness during polishing.

To be specific, in Figure 4-57 (a), we increase the value of $E_{pad}$ by 50% from the optimal fit of Figure 4-49. Qualitatively, we see that the post-CMP topography is smoother on the edges of the arrays with significant oxide and copper loss, such as in the Array 100/1 structure. Quantitatively, the 50% change in $E_{pad}$ results in only a small change in the overall RMS error, from 128 Å to 145 Å. In Figure 4-57 (b), we decrease the value of $\lambda$ by 47% from the optimal fit of Figure 4-49. Qualitatively, as discussed earlier, we see that the prediction of the down-area polish becomes poorer. Quantitatively, the 47% change in $\lambda$ again results in a small change in the overall RMS error, from 128 Å to 140 Å. Finally, if we change the blanket removal rate by 2%, we see that the overall RMS error changes moderately, from 128 Å to 135 Å.
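
One way to put these three perturbations on a common scale is a normalized sensitivity: the relative change in RMS error divided by the relative change in the parameter. The Python sketch below is only an arithmetic illustration using the numbers quoted above; the function names are hypothetical, and this metric is not part of the thesis methodology. Since only the 2% size of the removal rate change is given, a unit baseline is assumed for that case.

```python
def relative_change(old, new):
    return (new - old) / old

def normalized_sensitivity(param_old, param_new, rms_old, rms_new):
    """|relative RMS change| divided by |relative parameter change|."""
    return abs(relative_change(rms_old, rms_new)) / abs(relative_change(param_old, param_new))

BASE_RMS = 128.0  # Å, overall RMS error at the optimal fit

sensitivities = {
    "E_pad (80 -> 120 MPa)": normalized_sensitivity(80.0, 120.0, BASE_RMS, 145.0),
    "lambda (1500 -> 800 Å)": normalized_sensitivity(1500.0, 800.0, BASE_RMS, 140.0),
    # Only the 2% magnitude is quoted, so a unit baseline is assumed here.
    "blanket removal rate (2%)": normalized_sensitivity(1.0, 1.02, BASE_RMS, 135.0),
}

for name, value in sensitivities.items():
    print(f"{name}: {value:.2f}")
```

On this scale the overall RMS error responds roughly ten times more strongly to the removal rate than to $E_{pad}$ or $\lambda$, which is consistent with using the overall copper thickness error to fix the removal rate and the Array 100/1 profile shape to fix $E_{pad}$ and $\lambda$.
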
4.4.3 Step 1 Verification Results

The previous sections presented the fitting results, where the parameters of the CMP model are extracted such that the error between measurement and model prediction is minimized. In order to test how well the extracted model predicts results for a different layout, it is applied to a second test mask, a MagnaChip internal CMP verification test pattern.

Figure 4-58 compares the recess data after baseline step 1 polishing to the chip-scale simulation results for the MagnaChip internal verification mask based on the calibrated copper polishing model. Although the general trends are captured, there are apparent discrepancies for the features with high pattern density and narrow line space, such as 1.5/0.18 and 1.5/0.3 (line width / line space, in μm). The exceptionally large recess for these structures cannot be explained by the copper polishing process itself, even if all of the nominal down force is applied on the narrow line space. A possible explanation is that other pattern dependencies, such as those in barrier layer deposition and pattern etching, are interacting with the CMP process. The deposited barrier thickness likely varies with the line space, especially for deep sub-micron line spaces. However, the variation of the barrier layer deposition is limited and cannot explain the recess of over 400 Å for the structure with 1.5/0.18 line width and line space. In this case, the pattern dependencies in the feature etching are more likely to be the major reason, and are a better match to the discrepancy between the data and simulation results. In particular, features with high pattern density and very narrow line space present the worst situation for etching. Indeed, the fitted polishing removal rates of TaN and oxide are several times higher than the blanket removal rates, in order to compensate for the additional recess from other pattern dependencies in the calibration structures of 100/1 and 9/1 line width and line space. In the verification mask, the fine arrays, such as 1.5/0.18 and 1.5/0.3, are more sensitive to etch pattern dependencies. Thus, the additional recess in these arrays cannot be completely compensated by the fitted polish removal rates of TaN and oxide.

The chip-scale modeling of the evolution of the 3-D topography and step height for the MagnaChip internal mask is shown in Figures 4-59 to 4-67. The barrier layer starts to be exposed at 80 sec, and the process is finished at 92 sec. These times are significantly later than the corresponding times of 70 and 86 sec for the MIT/SEMATECH 854 M1 step 1. The mask design can substantially affect the post-electroplating topography, and thus the timing of the copper clearance and over-polishing. There is a trade-off between wafer-level copper clearance and dishing/erosion control. This observation further highlights the limits of simple layout design rules, which are not able to account for the interaction of the chip design with wafer-level clearing and endpointing.
Figure 4-58. Envelope data and simulation results for MagnaChip mask step 1.

(a) 3-D topography map (Å)  (b) Effective step height map (Å)

Figure 4-59. Chip-scale CMP modeling for verification mask step 1 at t=0 sec.
Figure 4-60. Chip-scale CMP modeling for verification mask step 1 at t=10 sec.

Figure 4-61. Chip-scale CMP modeling for verification mask step 1 at t=20 sec.

Figure 4-62. Chip-scale CMP modeling for verification mask step 1 at t=30 sec.
Figure 4-63. Chip-scale CMP modeling for verification mask step 1 at t=50 sec.

Figure 4-64. Chip-scale CMP modeling for verification mask step 1 at t=70 sec.

Figure 4-65. Chip-scale CMP modeling for verification mask step 1 at t=80 sec.
4.4.4 Step 2 Calibration

In this section, the calibration results for the barrier layer polishing step (step 2) will be summarized. Here, the removal rate of the Cu10K2 slurry follows the Preston equation, with a removal rate that is linearly proportional to the pressure.

Figures 4-68 and 4-69 show the HRP array scan data for MIT/SEMATECH 854 M1 step 2 polishing at 30 sec and 50 sec, respectively. The barrier layer sidewall protrudes at the single lines and at the ends of the array regions, and almost no removal is observed at 30 sec in the barrier layer polishing. Most of the protrusions are eliminated before 50 sec of polishing. A possible explanation for this phenomenon is galvanic corrosion. The different electrochemical potentials of the copper and tantalum nitride at the interface can produce different dissolution rates of the two materials, such that the barrier layer material is protected by the additional electron loss of the copper [94].

In order to capture the impact of the barrier sidewall protrusion in the material removal, the removal rates of the TaN and oxide have to be modified in the initial 50 sec of polishing. This removal rate modification will introduce a kind of pattern dependency in the removal rates of the TaN and oxide. Although most impacts of the protrusions are reflected in the removal rate modification, there is some deviation in the modeling and simulation.

The fitted model parameters for the step 2 polishing are:

\[ R_{Cu} = 13.5\ \text{Å/sec}, \quad R_{TaN} = 18\ \text{Å/sec}, \quad R_{Oxide} = 12.7\ \text{Å/sec} \quad (\text{at 2.80 psi or 19.3 kPa}) \]

\[ E_{pad} = 150\ \text{MPa}, \quad K = 0.003, \quad \lambda = 800\ \text{Å} \]

The extracted removal rates of the three materials are reasonable and close to the blanket rates. The resulting RMS errors for the oxide erosion and field oxide loss are 22 and 77 Å, respectively. Since the dishing data are too weak to be reliably extracted from the HRP scans, the RMS error for the dishing is ignored; this is consistent with our observation for the fitted data, where the simulated dishing results are less than 30 Å. The array HRP simulation results for 30, 50 and 70 sec polishing are shown in Figures 4-70 through 4-72.
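
As a quick arithmetic check on these step 2 parameters, the Python sketch below applies linear Preston scaling to the fitted rates, converts the reference pressure from psi to kPa, and computes the blanket thickness removed in 30 sec together with the TaN-to-oxide rate ratio. The function names are hypothetical, and the calculation deliberately ignores pattern effects and the removal rate modification applied in the initial 50 sec, so it describes blanket behavior only.

```python
PSI_TO_KPA = 6.894757  # 1 psi expressed in kPa

# Fitted step 2 removal rates in Å/sec at the 2.80 psi reference pressure.
STEP2_RATES = {"Cu": 13.5, "TaN": 18.0, "Oxide": 12.7}
REF_PRESSURE_PSI = 2.80

def preston_rate(material, pressure_psi):
    """Removal rate scaled linearly with pressure (Preston behavior)."""
    return STEP2_RATES[material] * pressure_psi / REF_PRESSURE_PSI

def removed_thickness(material, pressure_psi, seconds):
    """Blanket thickness removed in Å, ignoring all pattern dependencies."""
    return preston_rate(material, pressure_psi) * seconds

print(round(REF_PRESSURE_PSI * PSI_TO_KPA, 1))  # reference pressure in kPa
for material in STEP2_RATES:
    print(material, round(removed_thickness(material, REF_PRESSURE_PSI, 30.0), 1))
print(round(STEP2_RATES["TaN"] / STEP2_RATES["Oxide"], 2))  # TaN:oxide rate ratio
```

The three rates are within a factor of about 1.4 of each other, which is the low selectivity expected of the barrier slurry.
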
Figure 4-68. HRP array scan data for MIT/SEMATECH 854 M1 step 2 polishing at 30 sec (X: μm, Y: Å).
Figure 4-69. HRP array scan data for MIT/SEMATECH 854 M1 step 2 polishing at 50 sec (X: μm, Y: Å).
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 4-70. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 30 sec (X: μm, Y: Å).
Figure 4-71. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 50 sec (X: μm, Y: Å).
Figure 4-72. Array topography simulation for MIT/SEMATECH 854 M1 step 2 polishing at 70 sec (X: μm, Y: Å).
Figures 4-73 through 4-76 show the chip-scale topography evolution in the barrier polishing step for the MIT/SEMATECH 854 M1 wafers. The results clearly illustrate that the non-selective slurry removes most of the step height quickly, and that the copper and oxide loss is directly related to the copper loss and pattern density after the copper polishing step. The key to reducing copper and oxide loss and improving the evenness of the post-CMP topography is to reduce the copper dishing in the previous polishing step.

Figure 4-73. Chip-scale CMP modeling for M1 step 2 at t=0 sec.

Figure 4-74. Chip-scale CMP modeling for M1 step 2 at t=30 sec.
4.4.5 Step 2 Verification Results

The model, which was fit using the MIT/SEMATECH 854 mask data, is next applied to the prediction of results for the MagnaChip internal mask. Figure 4-77 compares the recess data after baseline step 2 polishing to the chip-scale simulation results for the MagnaChip internal verification mask, based on the calibrated copper polishing model. Although the general trends in the surface profiles are captured, several factors contribute to errors. First, there are apparent discrepancies for the features with high pattern density and narrow line space, such as 1.5/0.18 and 1.5/0.3 line width and line space (μm). This error stems from the simulation error in the copper polishing step. Second, as mentioned before, the barrier sidewall obstructs the material removal and makes it difficult to accurately model and simulate the barrier polishing process with the simple removal rate modification. Third, the array regions used in the verification mask are much smaller than the wide array regions used in the model calibration mask, which are up to 2 mm in size. This introduces substantial difficulty in data extraction as well as systematic model error.

![Figure 4-77. Envelope data and simulation results for MagnaChip mask step 2](image)

The chip-scale modeling of the evolution of the 3-D topography and step height for the barrier removal step using the MagnaChip internal mask is shown in Figures 4-78 to 4-80. Again, the copper and oxide loss is directly related to the copper loss and pattern density after the copper polishing step.
Figure 4-78. Chip-scale CMP modeling for verification mask step 2 at t=0 sec.

Figure 4-79. Chip-scale CMP modeling for verification mask step 2 at t=30 sec.

Figure 4-80. Chip-scale CMP modeling for verification mask step 2 at t=50 sec.
4.5 M2 Simulation

As in the chip-scale ECD model, the underlying topography for M2 is simplified by using one variable, the average surface height, instead of the usual two variables, envelope and step height. The small dishing and cell size make this approximation reasonable and easy to handle in multi-level modeling and simulation. Although the dishing for some arrays, such as the 9/1 line width / line space array (μm), is noticeable, the non-selective barrier slurry will remove this underlying height variation within an individual cell.
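
A minimal Python sketch of this kind of reduction is given below. The area-weighted formula (envelope minus the recessed area fraction times the step height) is an assumption made here for illustration; the thesis does not state in this section how the average surface height is computed. The function name and the sample cells are hypothetical.

```python
def average_surface_height(envelope, step_height, recessed_fraction):
    """Collapse (envelope, step height) into one area-weighted average height.

    Assumed formula: the raised area sits at `envelope`, the recessed area sits
    `step_height` below it, and `recessed_fraction` is the recessed area share.
    """
    if not 0.0 <= recessed_fraction <= 1.0:
        raise ValueError("recessed_fraction must lie in [0, 1]")
    return envelope - recessed_fraction * step_height

# Hypothetical grid cells (heights in Å), for illustration only.
cells = [
    {"envelope": 5000.0, "step_height": 200.0, "recessed_fraction": 0.9},
    {"envelope": 5100.0, "step_height": 50.0, "recessed_fraction": 0.5},
    {"envelope": 5200.0, "step_height": 0.0, "recessed_fraction": 0.0},
]

underlying_map = [average_surface_height(**cell) for cell in cells]
print(underlying_map)
```

The information discarded is the within-cell step height, which is acceptable only when dishing is small compared with the cell-to-cell height variation.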

Figures 4-81 through 4-83 compare the HRP scan data from the MIT/SEMATECH 854 second level (M2) test mask with the simulated results for the M2 copper polishing step at 100, 130 and 160 sec, based on the calibrated M1 model. The M2 arrays shown in the following figures are 0.5/0.5 (μm) and the underlying M1 array patterns are indicated in the labels of each sub-plot. The copper removal rate is adjusted to 103 Å/sec, in order to follow the absolute M2 copper field thickness data. As discussed previously, temperature and pad asperities can impact the copper removal rate of the abrasive-free slurry.

There are two major sources of the observed simulation error: underlying topography simulation error, and wafer-to-wafer non-uniformity. As mentioned above, the underlying topography simplification ignores the step height for arrays with narrow spaces, including the 9/1, 100/1 and related structures. In order to test the M2 model and capture the multi-level impacts, the M1 processing for just these wafers used a process chosen to intentionally exaggerate the underlying M1 topography variation.
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 4-81. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 100 sec (X: μm, Y: Å).
Figure 4-82. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 130 sec (X: μm, Y: Å).
Figure 4-83. Array topography simulation for MIT/SEMATECH 854 M2 step 1 polishing at 160 sec (X: μm, Y: Å).
Figure 4-84 shows the initial topography for M2 polishing. The impacts of the underlying topography, shown in Figure 4-85, can be classified into a local height variation impact, which occurs laterally over tens to several hundreds of microns, and a long-range height variation impact, which occurs at the mm scale across the chip. The direct overlap of the wide features (> 40 µm) of the M1 and M2 patterns leaves 6000 to 8000 Å of local height variation. However, this substantial local height variation is almost entirely eliminated during the copper polishing for the wide, directly overlapped features, except for the large pad regions and the wide arrays with large line width and low pattern density, as shown in the chip-scale post-CMP topography simulation in Figure 4-86. As mentioned in the early part of this chapter, the dishing of features from 40 to 200 µm is affected only by the asperity deformation, rather than by the pad bulk bending, if the pattern density is not low. In this case, the pressure threshold of the abrasive-free slurry limits the maximum dishing to below 800 Å, even for the directly overlapped wide features. However, if the feature size is large enough, as in the pad regions, or if an array with large line width and low pattern density is large enough (e.g., over 1 mm), then the long-range bending of the pad bulk will have a significant impact on the topography, and the post-CMP topography is directly related to the size of the area. Due to the high selectivity of the abrasive-free slurry, the long-range height variation of the underlying topography is almost a simple addition to the height variation from the M2 pattern.

In summary, in the case of abrasive-free slurry polishing, the multi-level CMP effect comes from the long-range underlying topography rather than from the local height variation. With a traditional copper slurry, by contrast, the dishing is sensitive to the over-polishing time, so the local underlying height variation has to be considered in the multi-level case.
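
The additive picture described above can be sketched in a few lines of Python. The moving-average filter used to separate the long-range component from the local one is a stand-in chosen here for simplicity, not the method used in the chip-scale model, and the window size and function names are hypothetical. The sketch splits an underlying M1 height profile into long-range and local parts, then adds only the long-range part to the M2-only height variation.

```python
def moving_average(profile, window):
    """Centered moving average with truncated edges; window must be odd."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed

def split_topography(profile, window):
    """Return (long_range, local) components, with profile = long_range + local."""
    long_range = moving_average(profile, window)
    local = [h - s for h, s in zip(profile, long_range)]
    return long_range, local

def stacked_height(m2_only, m1_profile, window):
    """M2 height variation plus the long-range part of the M1 profile (assumed additive)."""
    long_range, _ = split_topography(m1_profile, window)
    return [a + b for a, b in zip(m2_only, long_range)]

# Hypothetical 1-D profiles in Å on a uniform grid, for illustration only.
m1 = [0.0, 0.0, -600.0, 0.0, 0.0, -300.0, -300.0, -300.0, -300.0, -300.0]
m2 = [0.0, -50.0, -50.0, 0.0, 0.0, 0.0, -50.0, -50.0, 0.0, 0.0]
print(stacked_height(m2, m1, window=5))
```
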
Figure 4-84. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 1 at 0 sec.

Figure 4-85. Underlying topography for MIT/SEMATECH 854 M2.
Figure 4-86. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 1 at 100 sec.

Figure 4-87 shows the remaining copper thickness map over the regions of the chip which do not have an M2 pattern. The exaggerated local height variation from the wide M1 feature makes it difficult to clear the overburden copper in the M2 copper polishing. The dishing limit mechanism due to the pressure threshold of the abrasive-free slurry is the key reason for this copper clearance problem. If the same copper polishing process (with stronger or more exaggerated dishing and erosion) used in M1 is applied in M2, there is almost no copper clearance problem. In addition, the subsequent barrier polishing step can be extended to further correct this issue, at the expense of increased oxide and copper total loss.
Figures 4-88 and 4-89 compare the HRP scan data with the simulated results for the M2 barrier layer polishing step at 30 and 60 sec, based on the calibrated M1 model. There are two major sources of simulation error: the error in the previous topography simulation and the wafer-to-wafer non-uniformity. The simulation error from the underlying topography simplification, which arises from ignoring the dishing step height, is mostly corrected in the barrier layer polishing, since the non-selective slurry averages out the step height effect.
Blue: HRP scan data, Red: grid top, Yellow: grid bottom

Figure 4-88. Array topography simulation for MIT/SEMATECH 854 M2 step 2 polishing at 30 sec (X: μm, Y: Å).
Figure 4-89. Array topography simulation for MIT/SEMATECH 854 M2 step 2 polishing at 60 sec (X: μm, Y: Å).
Figure 4-90 shows the chip-scale topography simulation for the M2 barrier layer polishing step at 60 sec. The surface height variation after the copper polishing, which includes the contribution from the underlying M1 topography, is reduced after the barrier layer polishing.

![Figure 4-90](image)

(a) Envelope map (Å)  
(b) Step height map (Å)

Figure 4-90. Chip-scale CMP modeling for MIT/SEMATECH 854 M2 step 2 at 60 sec.

Figure 4-91 shows the remaining copper and barrier thickness map over the area without M2 pattern. Although the clearance problem has been improved a little, the exaggerated local height variation in the underlying topography still causes difficulty in the M2 polish. Although the clearance problem can be mostly eliminated by using the M1 process conditions, process variation can still result in a potential clearance problem, even with a longer barrier layer polishing time. In order to improve the process robustness, the optimization in the layout pattern and dummy design needs to be implemented; this is discussed further in Chapter 5.
4.6 Conclusions

A new chip-scale CMP model incorporating asperity height and contact area size distributions is developed, and extended to the multi-level case. The RMS error for dishing and erosion is reduced to less than 100 Å without a significant sacrifice in computational efficiency. For the verification mask, the step 1 and step 2 simulations with the new model take 60 and 80 min, respectively. A three-part modeling framework, consisting of block, cell, and feature-level parts, has been developed that matches the hierarchical scale structure in the CMP process. The contact-wear model and the modified density-step-height model are seamlessly integrated in the three-part framework. Further, the different steps (bulk copper polish, copper over-polish, and barrier polish) at different metal levels, as well as the ECD process, are seamlessly integrated.

Two critical issues are indicated in the CMP model calibration and verification. First, the temperature effect in the pad properties and slurry removal rate during copper
polishing cannot be ignored. Second, other pattern dependencies from other processes in
the multi-level copper metallization, such as etching and deposition, have to be modeled
and incorporated into the integrated ECD/CMP model to fully capture the surface height
variation in random layouts. The integrated ECD/CMP model can be the first step to
develop a complete pattern dependency simulator for BEOL.

Generally, the long-range height variation in the underlying topography can be
transferred into the higher level topography, while the local height variation in the
underlying topography can be ignored if the size of the overlapping higher level patterns
is over 20 to 40 μm, due to the dishing limit effect of the abrasive-free slurry.

Based on these simulation results, a few general guidelines for layout design can be
identified to reduce the dishing, erosion, and multi-level effects. First, one should avoid
wide features, especially large arrays with large line width and low pattern-density.
Second, one should avoid the overlap of arrays with large line width and low pattern-
density or pad regions in the subsequent levels. Third, the overlapping features in the next
higher metal level should be (and generally are) larger than the underlying patterns.
However, the ultimate strategy to deal with dishing and erosion and multi-level effects is
to screen the layout with chip-scale modeling and simulation, and to use dummy fills.
The dummy fill design and ECD/CMP co-optimization will be discussed in the next
chapter as an application of the integrated ECD/CMP model.
Chapter 5

ECD/CMP Simulation and In-Pattern Dummy Fills

With the improved ECD/CMP chip-scale model, multilevel copper metallization co-optimization is viable and promising for process integration and other applications, such as layout optimization and dummy fill design. In this chapter, in-pattern dummy fills and their impacts on the copper metallization are discussed as one example for the application of the ECD/CMP integrated model.

First, the motivation for the strategy of low down-force conventional CMP with in-pattern dummy fills and reduced copper film thickness is discussed in Section 5.1. Then the pseudo-process conditions for ECD and CMP co-optimization are proposed in Section 5.2, based on the calibrated ECD/CMP models and the latest developments in tool design and consumables improvement. This is followed in Section 5.3 by the introduction of design guidelines for in-pattern dummy fills, and description of several possible dummy fill patterns. The post copper polishing topography is simulated by using the calibrated ECD/CMP models with the pseudo-process conditions and the various dummy designs; these results are presented in Section 5.4. Finally, Section 5.5 concludes with a discussion of other potential ECD/CMP integrated model applications and a new strategy to extend the conventional CMP process.
5.1 Motivations

Low down-force polishing is one of the key requirements for an advanced CMP process compatible with low-K materials. The industry faces significant challenges in developing planarization technology that meets the increasingly stringent specifications of the 65 nm technology node and below. To extend conventional CMP tools to future technology nodes, with process throughput, dishing, and erosion performance comparable to or better than that of promising processes such as ECMP for copper planarization, a new strategy involving ECD/CMP process co-optimization and in-pattern dummy fill design is proposed.

Combined with high-velocity low-down-force tools and advanced consumables such as abrasive-free slurries, a reduced electroplated copper thickness, down to about 1 to 1.5 times the trench depth, can help to improve the process throughput and yield. The lower limit for the electroplated copper thickness would be the trench depth, in order to handle wide features that are more than twice as wide as the electroplated copper thickness. However, the pattern dependent problem of dishing and erosion would be worsened, since the reduced copper thickness leaves a shorter polishing time budget to remove the uneven incoming topography. Additional methods need to be implemented to compensate for this low polishing time budget, and to further improve the post-CMP surface evenness.

Dummy filling is a common method to deal with a variety of pattern dependencies in a number of different processes, including etching, CMP and film depositions. The most commonly used method in CMP is to insert metal dummy fills in the wide field regions to reduce the range of layout pattern density across the chip, and thereby reduce the range in post-polish topography. Here, this type of dummy fill is referred to as “between-pattern” dummy fills. Figure 5-1(b) shows an example implementation of between-pattern dummy fills sitting beside a wide-feature high-pattern-density region.

(a) Without dummy fills

(b) With between-pattern dummy fills

(c) With in-pattern dummy fills

Figure 5-1. Schematic showing traditional between-pattern and in-pattern dummy fills. (Left) Cross sectional view. (Right) Top down view.

The basic idea for the between-pattern dummy fills is to increase the fraction of metal area in the field regions, which induces dishing and increases erosion in these areas during the bulk copper polishing step; then, in the subsequent barrier removal step a low-selectivity slurry is used which planarizes any raised dielectric to improve the nonplanarities introduced by the between-pattern dummy structures, as well as to improve or reduce any additional topography induced on nearby active (non-dummy) patterned structures. Thus improved surface planarity is achieved, but at the price of sacrificing additional copper and dielectric thickness.
The sacrificial or “subtractive” dummy fill strategy is not an ideal way to even the topography, suffering from wasted process time and additional cost. A direct method to decrease the copper (and oxide) loss in the wide features in the copper polishing step, in an “additive” strategy, would be advantageous. Another problem for the between-pattern dummy fills is their limited effectiveness with the state-of-the-art high-selectivity low-dishing-and-erosion copper slurries, like the abrasive-free slurries, as shown in Figure 5-1(b). Generally, the size of the metal dummy fills is small, at or below the micron scale, to fill up the irregular shape of the field regions and avoid introducing additional unevenness due to dishing within the fill features. In an advanced copper polishing process, one would expect the copper dishing loss for the small between-pattern dummy fills to be much smaller than that which exists in the targeted high-dishing and/or high-erosion regions. At the same time, the barrier layer above the dielectric in the dummy filled regions can be considered to serve as an effective polishing stop, and prevent or reduce the degree of dielectric loss in these regions. Thus the between-pattern dummy fills will ultimately have only a very weak impact on the final topography remaining after barrier layer removal: they are not effective at preventing the dishing which occurs in large copper features or the erosion which occurs in dense fine line array regions. In-pattern dummy fills transform the wide features into (effective) small features to take full advantage of the abrasive-free slurry.

In contrast, the use of “in-pattern” dummy fills can realize an additive strategy, increasing the thickness of copper features without sacrificing copper and dielectric thickness in order to improve the final topography effectively. Instead of putting dummy fill in the non-patterned field regions as is conventionally done with between-pattern approaches, the in-pattern dummy fill is inserted into the wide patterns with dimensions over 10 µm, as shown in Figure 5-1(c). This in-pattern dummy fill strategy is also known as “slotting” or “cheesing.”

It is well known that wide features have large dishing, and high-pattern density regions have significant erosion. In many cases, the high-pattern density regions also have wide features. A key approach which can reduce or limit the post-CMP topography is to decrease the copper loss in the wide features. The low down-force polishing processes, for example, reduce the pad bending in the wide features and result in lower dishing and erosion. The abrasive-free slurry can effectively reduce the copper dishing in small features, due to the pressure threshold in the removal rate and the very high selectivity to barrier and dielectric. However, the abrasive-free slurry is not as effective at preventing dishing in very wide features.

A second method to reduce the copper loss in wide features is to improve the post-electroplating topography planarity, as is achieved in the plating bump reduction using a leveler additive, and then to strictly control the over-polishing time using such approaches as an in-situ copper residual monitor and closed-loop process control. This reduces copper loss by minimizing the excess polishing time needed after clearing the copper, which is the interval in which most dishing occurs.

Promoting copper deposition in the wide trenches is another way to reduce the post-CMP dishing loss in the wide features. In Figure 5-2(b), the in-pattern dummy fills are placed in the wide features. The additional surface area for electroplating additive adsorption enhances the copper deposition rate in the wide features, and results in a more uniform post-electroplating topography across the chip which the CMP process must deal with. The additional copper deposition can also compensate for the shorter polishing time budget resulting from a reduced overall electroplated copper film thickness, and decrease the over-polishing time that would otherwise be suffered by regions with wide features. Perhaps most importantly, the in-pattern dummy fill can decrease the effective line width and thus limit the pad bending into these wide features during CMP. The protruding dielectric posts or structures which make up the in-pattern dummy fills take on a large part of the pressure which would normally bear on the wide copper features. The copper loss with in-pattern dummy fill can thus be reduced significantly, as illustrated schematically in Figure 5-2(b). The careful design of in-pattern dummy fills, optimized to take advantage of both electroplating and CMP benefits, has the potential to substantially reduce the large copper loss due to dishing in wide patterned features.

An approach also focusing on in-pattern dummy fills has been used to generate a more planar post-electroplating topography, to compensate for the inherently low planarization ability of electropolishing. In the approach developed by ACM Research [24], both dummy metal (between-pattern dummy fills) in wide field areas and dielectric posts in wide features (in-pattern dummy fills) are added to improve the pre-polishing incoming wafer surface planarity. Their strategy is to minimize the pattern density range across the whole chip (increasing the copper pattern density in field regions, and decreasing the copper pattern density in wide features) to achieve a more uniform post-plating topography for the subsequent electropolishing step. One disadvantage of this approach is the large number of inserted dummy fills, which can be a challenge in the layout and mask generation systems.
In contrast, the in-pattern dummy fills in the proposed method are implemented not only to increase the planarity of the post-electroplating topography, but also to partly compensate for the smaller polishing time budget in using thinner copper films, and to address the key copper thickness loss mechanism due to dishing in CMP. The focus is not purely on pattern density, but rather to also insert in-pattern structures to distribute the down force and limit copper removal in the recessed area by reducing the pad bulk bending and pad asperity contact in CMP. Finally, the in-pattern dummy fill strategy is leveraged by the application of the integrated ECD/CMP chip-scale simulation capability, which enables the optimization of the dummy fill based on the joint effects in both ECD and CMP, rather than using minimal pattern density range as a stand-in for final post-CMP topography improvement. More details about the proposed dummy fill design are presented in Section 5.3, after introduction in the next section of a “synthetic” or hypothetical ECD/CMP process that will be the basis for demonstration of the new dummy fill evaluation.

5.2 Pseudo-Processes

The calibrated copper electroplating and copper CMP models presented in Chapters 3 and 4 serve as the starting point for the co-optimization of these processes and the implementation of in-pattern dummy fill strategies. However, we modify the model parameters to reflect expected process and consumable improvements in future advanced ECD and CMP processes. The tunable pseudo-processes using the integrated ECD/CMP model provide the capability to search for effective process windows and to optimize the consumable properties. Although there are likely to be some differences from the process conditions ultimately available in future ECD and CMP technology, the simulation results can provide insight for future research and development, and enable evaluation of dummy fill strategies for future technology nodes.

The model parameters used in the pseudo ECD process are as follows:

\[ K = 15, \; k_1 = 0.035 \; s^{-1}, \; k_2 = 0.005 \; s^{-1}, \; \theta_{acc,eq} = 0.05 \]
\[ L_{ECD} = 1200 \; \mu m, \; \alpha = 0.3 \]

The only difference from the calibrated model parameters in Chapter 3 is the smaller value of the inter-feature cupric ion depletion index, \( \alpha \). The adjustment is consistent with more recent industrial process data. The assumed copper seed layer and electroplated copper thickness are 1000 Å and 4000 Å, respectively. An important note is that this electroplated thickness is substantially decreased compared to the thickness in the experimental electroplating process of Chapter 3 (which was 8500 Å). A goal of this chapter is to also show that the plating thickness can be decreased through the use of in-pattern dummy structures.

Two copper slurry removal relations are assumed in the co-optimization. The first slurry is assumed to be similar to a typical abrasive-free copper slurry with the following removal rate dependence:

\[
R = R_{Cu} \left( \frac{P - P_{th}}{P_0 - P_{th}} \right)^{\alpha} \quad \text{for } P > P_{th} \tag{5-1}
\]

where \( P_{th} = 0.75 \) psi, \( \alpha = 0.9 \), and \( R_{Cu} = 45 \) Å/sec at the nominal down force \( P_0 = 1.5 \) psi. The exponent \( \alpha \) here is distinct from the cupric ion depletion index of the ECD model. The second slurry is assumed to behave similarly to a conventional abrasive copper slurry with a Prestonian removal rate. The copper removal rate under nominal down force is \( R_{Cu} = 45 \) Å/sec (at 1.5 psi). For both slurries, the removal rates for TaN and dielectric are assumed to follow Preston’s equation, with a rate of 0.4 Å/sec at the nominal down force of 1.5 psi.
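The two assumed copper removal-rate relations can be compared numerically with a short sketch. The parameter values come from the text; taking \( P_0 \) as the 1.5 psi nominal down force at which \( R_{Cu} \) is quoted, and setting the abrasive-free rate to zero at or below \( P_{th} \), are assumptions made here for illustration, as are the function names.

```python
# Sketch of the two assumed copper removal-rate relations (Eq. 5-1 and Preston).
R_CU = 45.0    # Å/sec, copper removal rate at the nominal down force (from the text)
P_0 = 1.5      # psi, nominal down force (assumed to be the P_0 of Eq. 5-1)
P_TH = 0.75    # psi, pressure threshold of the abrasive-free slurry (from the text)
ALPHA = 0.9    # exponent in Eq. 5-1 (from the text)


def abrasive_free_rate(pressure_psi: float) -> float:
    """Eq. 5-1 above the threshold; assumed zero at or below it."""
    if pressure_psi <= P_TH:
        return 0.0
    return R_CU * ((pressure_psi - P_TH) / (P_0 - P_TH)) ** ALPHA


def prestonian_rate(pressure_psi: float) -> float:
    """Preston's equation at fixed relative velocity: rate is linear in pressure."""
    return R_CU * pressure_psi / P_0


if __name__ == "__main__":
    for p in (0.5, 0.75, 1.0, 1.5, 2.0):
        print(f"P = {p:4.2f} psi: "
              f"abrasive-free = {abrasive_free_rate(p):6.2f} Å/sec, "
              f"Prestonian = {prestonian_rate(p):6.2f} Å/sec")
```

Both relations give the same rate at the nominal down force, while below it the abrasive-free relation falls off faster and vanishes at the threshold. This is the property that limits removal in lightly loaded recessed areas.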

The values of the other calibrated copper CMP model parameters are applied in the pseudo copper polishing processes. The removal rate or the down force can be increased or decreased with the use of a higher linear relative velocity during polishing and with improvements in the pad and slurry.

In order to simplify the simulation process, the final topography after barrier polishing is not discussed, since the usual low-selectivity barrier slurry will not significantly change the topography after copper polishing. For barrier slurries that are not low-selectivity, especially with low-K materials, the dummy fill design will be different; this case is discussed in a later section.

5.3 In-Pattern Dummy Fills

A simple in-pattern dummy fill approach is to use narrow oxide slots or lines as shown in Figure 5-3. However, there are several problems with this simple and straightforward design. This design can leave periodic copper loss channels as shown in the figure, and can overlap with lower level topography resulting in accumulation due to multilevel effects. In addition, this approach essentially divides a wide line into multiple separate and poorly interconnected lines. In order to minimize the electrical impact on the effective line width for the wide feature, and to maximize the enhancement of copper deposition within the wide features, narrow oxide structures can be laid into the wide trenches. However, narrow oxide structures in a high pattern density region (as in wide copper features) can have significant oxide erosion after polishing, and also suffer from problems in the barrier layer deposition and etching pattern dependencies as discussed in the previous chapters. Another concern is that the in-pattern dummy fills might introduce some additional interconnect RC delay, by reducing the equivalent copper wire width (cross-sectional area) which increases the wire resistance.
In order to keep the copper in the wide trench interconnected and make the post-CMP surface height more randomized spatially, a second in-pattern dummy fill design is proposed in Figure 5-4. In this design, the sites where the copper loss is maximal are more evenly distributed, and the long copper channels are avoided. At the same time, the electric current flow in the wide features is interconnected. If the dishing reduction (increased height) offsets the line width portion lost to the dummy dielectric (decreased wire width), no net wire cross sectional area is lost, and no RC delay results from the in-pattern dummy fills. However, in this design the narrow dielectric dummy fills within the wide features still face the potential for significant erosion loss and are subject to other problems such as barrier layer deposition nonuniformity and etching pattern dependency.
A further improved in-pattern dummy fill design, as shown in Figure 5-5, aims to deal with the disadvantages in the previous design by adding major pillar structures in each of the inserted dielectric line segments. The relatively large pillar structures will have good barrier layer coverage to protect the underlying dielectric and should be relatively insensitive to etching pattern dependencies. At the same time, the major pillars will more effectively distribute the down force and reinforce the nearby slots or fins of the in-pattern dummy fills during the copper overpolishing step, and limit the pad bulk bending and restrict the ability of the asperities to apply pressure on the recessed areas.

In order to make the major pillars compatible with the previous dummy fill structure and limit the impact on electric performance, there are a number of requirements for the pillar’s size, interval distance and shape. The elliptical shape of the major pillars avoids sharp corners which might impede electric current flow. The size specification of the proposed in-pattern dummy fills is shown in Figure 5-3. Here, the line segment length, \( d_2 \), is twice the interval distance between the line segments, \( d_1 \). In the following simulation cases, the value of \( d_1 \) is set to 3, 4, or 5 \( \mu m \). This ensures that the smallest copper grain size is over 1 \( \mu m \) to avoid possible copper film resistivity increases due to the copper grain size.

![Diagram of in-pattern dummy fill design #3.](image)

Figure 5-5. In-pattern dummy fill design #3.

A variant of the third dummy fill design is shown in Figure 5-6, which seeks to further improve the surface topography and reduce the amount of copper loss. However, this design might have penalties in further impacting the wire area and electrical performance. Thus, the third dummy fill design is used in the following ECD/CMP co-optimization as the standardized pattern, as a tradeoff between topography improvement and simplicity.
Table 5-1 shows the impact on the extracted effective line width, area fraction, and pattern perimeters of the in-pattern dummy fills with different size specifications ($d_1$ and $d_2$) based on in-pattern dummy fill design #3. Here, the area fraction of the in-pattern dummy fills refers to the local “pattern density” of inserted oxide structures. These area fractions serve to directly offset the copper pattern density within the wide features. We note that each successive fill pattern design in Table 5-1 increases the dielectric area fraction, from 7.24% for design #3-A, to 14.55% for design #3-C.

Table 5-1. Impacts of size specifications of in-pattern dummy fills.

<table>
<thead>
<tr>
<th>#</th>
<th>$d_1$ (μm)</th>
<th>$d_2$ (μm)</th>
<th>Effective Line Width (μm)</th>
<th>Additional Area Fraction (%)</th>
<th>Additional Pattern Perimeter in Cell (μm)</th>
</tr>
</thead>
<tbody>
<tr>
<td>3-A</td>
<td>5</td>
<td>10</td>
<td>5</td>
<td>7.24</td>
<td>163</td>
</tr>
<tr>
<td>3-B</td>
<td>4</td>
<td>8</td>
<td>4</td>
<td>9.75</td>
<td>205</td>
</tr>
<tr>
<td>3-C</td>
<td>3</td>
<td>6</td>
<td>3</td>
<td>14.55</td>
<td>275</td>
</tr>
</tbody>
</table>
5.4 Improvements Resulting from In-Pattern Dummy Fill

In this section, we consider the improvements resulting from the addition of various in-pattern dummy fills. The impact on the layout alone is first considered in Section 5.4.1. In Section 5.4.2, we then examine the improvements in the post-electroplated copper thicknesses resulting from the in-pattern dummy fills, based on ECD simulation. In Section 5.4.3, we examine the final post-CMP uniformity improvements achieved with the in-pattern dummy fill, based on further CMP simulation of the layout.

5.4.1 Dummy Fill Impact on Layout

Figures 5-7 through 5-10 show the pattern density and line width maps for the MIT/SEMATECH 854 M1 layout with and without the in-pattern dummy fills. Figure 5-11 shows histograms of pattern density for four layouts: the original MIT/SEMATECH 854 M1 layout without dummy fill, and the layouts with dummy fill strategies #3-A through #3-C. These histograms are useful in judging the range and distribution of pattern density values across the chip with and without the various dummy fill approaches. The key observation is that the in-pattern dummy fills eliminate the highest pattern densities on the chip, and specifically the bin at 100% pattern density. This implies that, in the dummy-filled layout, there are no 20 × 20 μm grid cells which remain at 100% pattern density. The in-pattern dummy fills are expected to act as polishing stops due to the near-zero removal rate for the barrier and dielectric, and these structures protect the copper within several microns of each structure. The layout histograms in Figure 5-11 are promising. In the next section, we consider the impact of these layout modifications on the post-electroplating topography.
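The effect on the 100% pattern density bin can be illustrated with a toy calculation. The sketch below is hypothetical throughout: it uses a small made-up raster (1 pixel assumed to be 1 μm) and straight dielectric slots rather than the MIT/SEMATECH 854 layout or fill design #3, and its function names are illustrative only.

```python
# Toy illustration of the pattern density histogram comparison.
# The layout, pixel size, and slot geometry are hypothetical.

CELL = 20  # grid cell edge in pixels (1 pixel = 1 μm assumed, so 20 × 20 μm cells)


def pattern_density(layout, cell=CELL):
    """Return the copper area fraction of each cell x cell block of a 0/1 raster."""
    rows, cols = len(layout), len(layout[0])
    densities = []
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            block = [layout[r][c]
                     for r in range(r0, min(r0 + cell, rows))
                     for c in range(c0, min(c0 + cell, cols))]
            densities.append(sum(block) / len(block))
    return densities


def add_slots(layout, pitch=5, offset=2):
    """Return a copy with 1-pixel-wide dielectric columns cut every `pitch` pixels."""
    return [[0 if c % pitch == offset else v for c, v in enumerate(row)]
            for row in layout]


def histogram(densities, n_bins=10):
    """Count densities in equal-width bins over [0, 1]; 1.0 falls in the last bin."""
    counts = [0] * n_bins
    for d in densities:
        counts[min(int(d * n_bins), n_bins - 1)] += 1
    return counts


# 40 × 40 μm toy layout: left half is a wide copper pad (1), right half is field (0).
original = [[1 if c < 20 else 0 for c in range(40)] for _ in range(40)]
slotted = add_slots(original)

for label, layout in (("no fill", original), ("slotted", slotted)):
    dens = pattern_density(layout)
    full = sum(1 for d in dens if d == 1.0)
    print(f"{label}: densities = {dens}, cells at 100% = {full}, "
          f"histogram = {histogram(dens)}")
```

In this toy case the two fully dense cells drop to 80% copper once slots are cut, so no cell remains in the 100% bin, while the field cells are unchanged.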
Figure 5-7. MIT/SEMATECH 854 M1 mask without dummy fills.

Figure 5-8. MIT/SEMATECH 854 M1 mask with dummy fills #3-A.

Figure 5-9. MIT/SEMATECH 854 M1 mask with dummy fills #3-B.
Figure 5-10. MIT/SEMATECH 854 M1 mask with dummy fills #3-C.

Figure 5-11. Pattern density histogram plots.
5.4.2 Dummy Fill Impact on Post-Electroplating Topography

In this section, ECD simulation results for the MIT/SEMATECH 854 M1 layout with and without in-pattern dummy fills are summarized and analyzed. Figures 5-12 through 5-15 show the post-electroplating topography maps, including the envelope and the average surface height, for the layouts with and without the in-pattern dummy fills. Again, the reference position (zero height value) is the top of the barrier layer. For the subsequent CMP process, the average surface height is the first-order factor which determines the overburden copper clearance time. As can be seen in the figures, the in-pattern dummy fills effectively enhance the copper deposition in the wide features and decrease the surface height variation. Figure 5-16 shows the histograms of average surface height for the four layouts. These demonstrate that the in-pattern dummy fills have enhanced the copper deposition at the wide features: the regions in the un-filled layout which have the smallest average surface height have been “thickened” in all of the dummy-filled layout histograms.

(a) Envelope map (Å)  
(b) Average surface height map (Å)

Figure 5-12. Post-electroplating topography maps without dummy fills.
Figure 5-13. Post-electroplating topography maps with dummy fills #3-A.

Figure 5-14. Post-electroplating topography maps with dummy fills #3-B.

Figure 5-15. Post-electroplating topography maps with dummy fills #3-C.
While these post-electroplating results are promising and confirm desired behavior, the true goal is to achieve more uniform post-CMP topography and reduced copper loss; the next section applies the coupled ECD/CMP model to evaluate the final results using different in-pattern dummy fill designs.

5.4.3 Dummy Fill Impact on Post-CMP Topography

To explore the effect of polishing slurries on in-pattern dummy fill effectiveness, two alternative hypothetical "pseudo-slurries" with different polishing behaviors are considered. In particular, we wish to compare the results for a conventional Prestonian slurry with a non-Prestonian abrasive-free polish slurry, as shown in Figure 5-17. The CMP simulation results for the abrasive-free slurry are shown first, followed by discussion of the conventional polish results.

![Graph showing removal rate vs. down force for the two hypothetical slurries.](image)

Figure 5-17. Removal rate vs. down force for the two hypothetical slurries.

5.4.3.1 Simulated Post-CMP Topography Using Abrasive-Free Slurry

Figures 5-18 through 5-21 show the post copper CMP topography maps, including the envelope and step height, for the MIT/SEMATECH 854 M1 layout with and without the in-pattern dummy fills, simulated with the hypothetical abrasive-free slurry. The copper polishing time is set at 120 sec since the complete overburden copper clearance time for all cases is about 110 sec. The in-pattern dummy fills significantly decrease the surface height variation of the post-CMP topography, including the envelope and step height. As expected, the higher the area fraction of inserted in-pattern dummy fills, the more uniform the post-CMP topography. However, the improvement from increasing the dummy fill area fraction is not very strong. The envelope and step-height variation ranges after the copper polishing step are below 500 and 800 Å, respectively. The topography planarity is further improved in the subsequent barrier polishing step, and the multi-level issue will be dramatically decreased or become negligible.
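As a rough consistency check on the quoted clearance time, the sketch below divides the field overburden by the nominal removal rate. It assumes that the field overburden equals the seed plus electroplated thickness from Section 5.2 and that it is removed at the blanket nominal rate. These are simplifications made here, not the chip-scale model itself.

```python
# Rough consistency check of the ~110 sec clearance time (assumptions labeled).
SEED_THICKNESS = 1000.0      # Å, from Section 5.2
PLATED_THICKNESS = 4000.0    # Å, from Section 5.2
NOMINAL_RATE = 45.0          # Å/sec at 1.5 psi, from Section 5.2
POLISH_TIME = 120.0          # sec, from the text

# Assumption: field overburden = seed + plated copper, removed at the nominal rate.
field_overburden = SEED_THICKNESS + PLATED_THICKNESS
clearance_time = field_overburden / NOMINAL_RATE
over_polish_time = POLISH_TIME - clearance_time

print(f"estimated clearance time: {clearance_time:.0f} sec")
print(f"implied over-polish time: {over_polish_time:.0f} sec")
```

Under these assumptions the estimate is about 111 sec, in line with the simulated clearance time, leaving roughly 9 sec of over-polish within the 120 sec step.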

Figure 5-18. Post-CMP topography maps without dummy fills, using hypothetical abrasive-free slurry.

Figure 5-19. Post-CMP topography maps with dummy fills #3-A, using hypothetical abrasive-free slurry.
Figure 5-20. Post-CMP topography maps with dummy fills #3-B, using hypothetical abrasive-free slurry.

Figure 5-21. Post-CMP topography maps with dummy fills #3-C, using hypothetical abrasive-free slurry.

A critical question for the in-pattern dummy fills is whether or not they degrade the electrical performance of interconnects, and if so by how much. Here, a preliminary discussion is presented based on the simulation results. The effective copper thickness with the dummy fill correction is used as a simple indicator for the electrical performance. The effective copper thickness is defined as the copper thickness in the dummy filled areas, adjusted for the lost area-fraction of the line consumed by the in-pattern dummy fills. This simplification ignores the impacts on the electrical performance from the copper grain size and structure, the fraction of the line consumed by the barrier metal, and any electrical impact due to scattering at the interface between copper and barrier. Figures 5-22 through 5-25 show the effective copper thickness maps for the dummy-filled areas, including the absolute value and the value relative to the non-dummy-filled case. The darkest color in the absolute value maps refers to the non-dummy-filled areas; these correspond to the zero value in the relative value maps.
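The effective copper thickness indicator can be sketched as follows. The multiplicative area correction is one plausible reading of the definition above and is an assumption, as are the function names and the 3000 Å unfilled thickness used for illustration; only the area fractions are taken from Table 5-1.

```python
# Sketch of the effective-copper-thickness indicator (the multiplicative area
# correction is an assumption; the thickness value below is hypothetical).

def effective_thickness(copper_thickness, dummy_area_fraction):
    """Copper thickness scaled by the fraction of the line area that is still copper."""
    return copper_thickness * (1.0 - dummy_area_fraction)


def relative_effective_thickness(filled_thickness, dummy_area_fraction,
                                 unfilled_thickness):
    """Positive when reduced dishing outweighs the area lost to the dummy fills."""
    return (effective_thickness(filled_thickness, dummy_area_fraction)
            - unfilled_thickness)


def break_even_thickness(unfilled_thickness, dummy_area_fraction):
    """Filled-line copper thickness needed to match the unfilled line."""
    return unfilled_thickness / (1.0 - dummy_area_fraction)


# Area fractions from Table 5-1; the 3000 Å unfilled thickness is hypothetical.
AREA_FRACTIONS = {"3-A": 0.0724, "3-B": 0.0975, "3-C": 0.1455}
UNFILLED = 3000.0  # Å

for name, fraction in AREA_FRACTIONS.items():
    needed = break_even_thickness(UNFILLED, fraction)
    print(f"design #{name}: needs >= {needed:.0f} Å of copper to break even")
```

The break-even thickness grows with the dummy area fraction, which is why denser fills must deliver a larger dishing reduction before the relative effective thickness becomes positive.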

There are several observations based on these simulated maps. In most cases, the in-pattern dummy fills improve rather than degrade the electrical performance. That means that the decrease in dishing and erosion overwhelms the effective line width decrease resulting from the in-pattern dummy fill structures. At the same time, a significant fraction of the sites with copper thicknesses which degrade lie on the edge of the wide features, and artifacts from simulations or the discretization error might be responsible.

With the increase in the area fraction of the dummy fills, the relative effective copper thickness for the lines ranging from 10 to 20 μm in width tends to be negative. That means that the benefit in dishing and erosion reductions cannot compensate fully for the loss in the effective line width in this range. There are two ways to limit these side effects on the electrical performance: to use less-dense dummy fills, or to raise the feature-size lower limit for the dummy fills.

Balancing the improvement in topography and electrical performance, including the effective copper thickness, the problems in the copper interface with the barrier layer, and grain size and structure, in-pattern dummy fill designs #3-A and #3-B appear to be good candidates for real implementation and experimentation.
Figure 5-22. Copper thickness maps of dummy filled areas without dummy fills, using hypothetical abrasive-free slurry.

Figure 5-23. Copper thickness maps of dummy filled areas with dummy fills #3-A, using hypothetical abrasive-free slurry.
Figure 5-24. Copper thickness maps of dummy filled areas with dummy fills #3-B, using hypothetical abrasive-free slurry.

Figure 5-25. Copper thickness maps of dummy filled areas with dummy fills #3-C, using hypothetical abrasive-free slurry.

Figure 5-26 shows the histograms of average surface height for the four layouts, which clearly illustrate that the in-pattern dummy fills enhance the copper deposition at the wide features, which have the smallest average surface height. As for the three in-pattern dummy fill layouts, the higher area fraction dummy-fill layouts (fill #3-B or #3-C) improve the mean value and sharpen the distribution of the copper thickness at the dummy filled areas. Thus the in-pattern dummy fills not only reduce the copper dishing but also narrow the distribution of the copper dishing at the dummy filled areas.
Figure 5-26. Histogram plots for copper thickness at dummy filled areas, using hypothetical abrasive-free slurry.

Figure 5-27 shows the histograms of effective average surface height for the four layouts also considering the effective line width shrinkage from the in-pattern dummy fills. This plot clearly illustrates that the in-pattern dummy fills decrease the effective copper line thickness, although the nominal copper thickness is increased. By balancing the mean value and distribution of effective copper thickness, the layout with dummy fill style #3-B appears to be a good candidate for further experimentation.
5.4.3.2 Simulated Post-CMP Topography Using Conventional Slurry

Simulations of the post-CMP topography are also performed using the conventional hypothetical Prestonian slurry. Based on these simulations, the previous observations and conclusions for the in-pattern dummy fills with the abrasive-free slurry continue to hold, although the problems in dishing are somewhat worse due to more significant material removal within all recessed areas. Figures 5-28 through 5-31 show the effective copper thickness maps for the dummy-filled areas as simulated using the conventional copper slurry with Prestonian removal rate dependence on pressure.
Figure 5-28. Copper thickness maps of dummy filled areas without dummy fills, using hypothetical conventional slurry.

(a) Absolute value (Å)  
(b) Relative value (Å)

Figure 5-29. Copper thickness maps of dummy filled areas with dummy fills #3-A, using hypothetical conventional slurry.
Figure 5-30. Copper thickness maps of dummy filled areas with dummy fills #3-B, using hypothetical conventional slurry.

Figure 5-31. Copper thickness maps of dummy filled areas with dummy fills #3-C, using hypothetical conventional slurry.

Figure 5-32 shows the histograms of effective average surface height for the four layouts simulated using the second slurry with a conventional Prestonian pressure dependence. This figure clearly illustrates that the mean values of effective copper line thickness are reduced and the distributions are widened. Neither is desirable. The results further confirm the superiority of the abrasive-free slurry, which has a smaller down-area material removal rate because of its non-Prestonian pressure dependence.
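The difference between the two pressure dependencies can be sketched as follows. The Prestonian rate is the standard Preston relation, in which removal rate is proportional to pressure times relative velocity. The non-Prestonian rate shown here is a simple threshold-pressure form chosen only for illustration; it is an assumption and not necessarily the form used for the hypothetical abrasive-free slurry in the simulations. All parameter values are hypothetical.

```python
def preston_rate(pressure, velocity, k_p):
    """Preston relation: removal rate proportional to pressure * velocity."""
    return k_p * pressure * velocity


def threshold_rate(pressure, velocity, k_p, p_threshold):
    """Assumed non-Prestonian form: no removal below a threshold pressure."""
    return k_p * max(pressure - p_threshold, 0.0) * velocity


# Hypothetical local pressures (arbitrary units): raised (up) areas carry
# most of the load, recessed (down) areas see a much smaller pressure.
p_up, p_down = 2.0, 0.5
velocity, k_p, p_threshold = 1.0, 1.0, 0.4

ratio_preston = (preston_rate(p_down, velocity, k_p)
                 / preston_rate(p_up, velocity, k_p))
ratio_threshold = (threshold_rate(p_down, velocity, k_p, p_threshold)
                   / threshold_rate(p_up, velocity, k_p, p_threshold))

print(f"down/up removal ratio, Prestonian: {ratio_preston:.4f}")
print(f"down/up removal ratio, threshold:  {ratio_threshold:.4f}")
```

With these assumed numbers the down-area removal is a smaller fraction of the up-area removal in the threshold case, which is the qualitative behavior credited above for the reduced dishing.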
5.5 Conclusions

An in-pattern dummy fill strategy involving insertion of dielectric structures within wide copper features is proposed. The newly developed physics-based integrated ECD/CMP model is used to simulate the final topography after copper over-polishing under different process conditions with various in-pattern dummy fill designs and a reduced electroplated copper thickness. The simulation results show that the optimized dummy fills can not only significantly improve the post-CMP topography, but in some cases can also enhance the electrical performance of interconnects even with the reduced electroplated copper thickness. The dummy fills reduce the effective pressure on the recessed areas and further reduce the copper removal amount by limiting the long-range pad bulk bending and the asperity local contact.

Due to limits on the area fraction of wide lines that can be consumed by the in-pattern dummy fills, the topography effects from the CMP process dominate those resulting from the ECD process. In some cases, the significantly reduced copper loss can successfully overcome the effective line width loss resulting from the dielectric dummy fills. By implementing the strategy with reduced electroplated copper thickness and in-pattern dummy fills, the process uniformity can be improved and the multilevel effect can be made negligible. At the same time, the costs of consumables and energy and the environmental impact will be decreased by reducing plating and polish times and thicknesses. The simulation results suggest that extensions of the conventional CMP technology, such as the use of abrasive-free slurries, have the potential to meet the needs of future technology nodes.

In the approach analyzed here, a low-selectivity barrier slurry is assumed, which simplifies the co-optimization process, and details of the barrier removal step are ignored. In addition, we assumed a conventional situation where the copper to dielectric selectivity is high (that is to say, the copper will polish much more rapidly than the dielectric). However, in some emerging low-K and copper damascene approaches, a reverse selectivity is being considered in the industry, in which the low-K dielectric may polish substantially faster than the copper. This different selectivity between copper and low-K materials, including during the barrier polishing step, requires careful simulation to account accurately for copper, barrier, and dielectric removal. In particular, the relatively higher removal rate of low-K materials could introduce significant dielectric loss at the wide non-patterned areas, rather than dishing in the wide copper features. In this case, between-pattern dummy fills could help to limit the low-K dielectric loss, in much the same way as in-pattern dummy fills decrease copper loss. A more general rule is to insert dummy fills at the wide patterned or non-patterned areas having the relatively higher removal rate in copper or barrier polishing. In the more general case, between-pattern and in-pattern dummy fills can be applied together based on the specific selectivity of the copper and barrier layers. The full simulation of copper and barrier removal can be used to optimize the dummy fill design, using a methodology similar to that described in this chapter.
Chapter 6

Conclusion and Future Work

In this chapter, the key results of this research and the major contributions of the thesis are summarized. Possible directions for future work are then suggested, including further improvement and refinement of the current ECD/CMP integrated models, and further model integration with other pattern dependent processes in the back-end-of-line (BEOL), especially film deposition and etching.

6.1 Summary and Conclusion

The research presented in this thesis has developed a physics-based integrated chip-scale electroplating/CMP pattern dependency model for multilevel copper metallization. A comprehensive modeling methodology is used, including test mask design, data characterization, model development and calibration, and simulation and validation for other layouts. Based on the integrated ECD/CMP model, process co-optimization and dummy fill design strategy are proposed to improve the conventional CMP process application in the advanced technology nodes.

A new copper ECD model has been contributed. The details of adsorption and desorption processes with competing additives, especially accelerators and suppressors, are introduced into the electroplating model. The previous electroplating model was a semi-empirical surface-response non-time-step model with serious limitations in the range of process conditions handled, limited flexibility for random two-dimensional layouts, and an inability to predict multilevel effects. The incorporation and simplification of the feature-scale physics-based model into the chip-scale electroplating model enables the model to track the evolution of the process conditions and to handle random two-dimensional layouts and multilevel effects. While retaining modest computational cost, the accuracy of modeling and simulation has been improved significantly (to approximately 150 Å rms error) in the usual process window used in the semiconductor industry.

Furthermore, the multilevel impact on the electroplating process is modeled and discussed. The results show that the underlying topography has almost no contribution to the higher level electroplating process for the advanced low-dishing-and-erosion CMP. The surface height variation is the superposition of the underlying topography and the copper thickness variation, which is determined only by the higher level layout.

The terminology and framework used for representing topography in the electroplating and CMP models are further unified, in order to seamlessly integrate the electroplating model with the CMP model. In the new CMP model, the contact wear and density-step-height dependencies are integrated into a three-step framework and extended to the multilevel cases. The density-step-height model is enhanced by considering the asperity height and contact size distributions from the latest research in the area. The details of the polishing pad and its asperities improve model accuracy and flexibility in process modeling without increasing the computation cost significantly. Several critical observations in the CMP modeling can improve our understanding of the CMP process and need further research. The possible temperature variation during the polishing process could affect the slurry removal rate and the mechanical properties of the pad and its asperities. The surface roughness change might be closely related to possible temperature variation during CMP, and so integration of topography evolution with temperature effects may be necessary. The pattern dependencies in other processes besides plating and CMP can also affect the post-CMP topography significantly. More study is needed of these effects, including potential undercut of narrow dielectric posts, variation in the barrier metal deposition thickness as a function of feature size, or width or depth variation in the plasma etch due to pattern density or feature size dependencies.

The calibrated and validated integrated ECD/CMP model has wide applications in process development and optimization, layout improvement, and dummy fill design. In this thesis, the ECD/CMP co-optimization and in-pattern dummy fills are presented as an example of this kind of application of the model. Several technology improvements are emerging in industry, including alternative low-down-force low-dishing-and-erosion processes such as electrochemical-mechanical polishing (ECMP). To meet the needs of advanced technology nodes and ever-more stringent requirements with feature size shrinkage and the implementation of low-K materials, conventional CMP techniques need to be improved in terms of dishing and erosion, defects, cost and yield. The ECD/CMP co-optimization demonstrates that 10% in-pattern dummy fills can effectively improve not only the topographical evenness but also the effective copper thickness by dummy fill correction, and the throughput can also be improved by reducing the electroplated copper thickness. However, the impacts of the electroplated copper thickness and in-pattern dummy fills on the copper resistivity have to be further investigated, including the electrical resistance and area tradeoffs associated with in-pattern dummy fill.

In summary, this research has made the following contributions:
1. Developed a chip-scale copper electroplating model that incorporates the effects of additives, feature-scale dependencies, and copper depletion, and can be applied to random layouts.

2. Developed an improved copper CMP model integrating contact wear, pattern density, and step height dependence in order to improve the prediction of dishing and erosion for random layouts.

3. Improved layout parameter extraction procedures and terminology for topographical features, enabling the seamless integration between electroplating and CMP models in chip-scale simulation and prediction for random layouts.

4. Characterized and validated both metal 1 and metal 2 electroplating and CMP pattern dependent effects, and extended the integrated model into the multilevel cases.

5. Identified the impacts of temperature variation and pattern dependency of other processes on the CMP modeling.

6. Illustrated the co-optimization of electroplating and CMP and demonstrated the significant improvement in the topography and effective copper thickness from the use of in-pattern dummy fills.

6.2 Future Work

Two possible directions of future work are suggested in this section, including improvement and refinement of the current ECD/CMP integrated models, and seamless model integration with other pattern dependent processes in the back-end-of-line (BEOL), especially film deposition and etching.
The current electroplating model only considers two additives, accelerators and suppressors. In the latest copper electroplating technology, three additives (accelerators, suppressors and levelers) are used to achieve super filling without bump formation and thus to improve the post-electroplating topography evenness. Given data from three-additive copper deposition, the current electroplating model can be extended by considering the competitive adsorption among the three additives.
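As a rough illustration of what adding a third competing additive involves, the sketch below evaluates the textbook competitive Langmuir isotherm, in which the equilibrium coverage of species i is K_i c_i / (1 + sum over j of K_j c_j). This equilibrium form is a stand-in chosen for brevity; it is not the time-dependent adsorption and desorption treatment of the electroplating model itself, and the equilibrium constants and concentrations are hypothetical.

```python
def competitive_langmuir_coverage(K, c):
    """Equilibrium fractional surface coverage for competing adsorbates.

    K: dict of adsorption equilibrium constants per species (assumed values)
    c: dict of concentrations per species, in units consistent with K
    """
    terms = {name: K[name] * c[name] for name in K}
    denom = 1.0 + sum(terms.values())
    return {name: term / denom for name, term in terms.items()}


# Two-additive case (hypothetical constants and concentrations).
two = competitive_langmuir_coverage(
    K={"accelerator": 2.0, "suppressor": 5.0},
    c={"accelerator": 1.0, "suppressor": 1.0},
)

# Three-additive case: the leveler competes for the same surface sites.
three = competitive_langmuir_coverage(
    K={"accelerator": 2.0, "suppressor": 5.0, "leveler": 1.0},
    c={"accelerator": 1.0, "suppressor": 1.0, "leveler": 1.0},
)

for label, cov in (("two additives", two), ("three additives", three)):
    print(label, {name: round(theta, 3) for name, theta in cov.items()})
```

In this simplified picture, introducing the leveler lowers the coverage of both existing additives because all three species share the same pool of surface sites.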

The details of the polishing pad and its asperities have been included in the CMP model to improve the model accuracy and flexibility. More details from the slurry can further enhance our understanding of the pattern dependency in CMP. As for the in-pattern dummy fills, only preliminary simulation results have been presented. Implementation in a real product layout can validate the simulation results and improve the dummy fill design. More importantly, it can address the impacts of the electroplated copper thickness and dummy fills on the copper resistivity, which are ignored in the preliminary modeling and simulation, and it allows investigation of the electrical performance at high frequency with in-pattern and between-pattern dummy fills.

Another direction is to develop a pattern dependency simulator for BEOL by expanding the current integrated ECD/CMP model to cover other process steps used in multilevel copper metallization. In this work, we have also found that pattern dependencies in other interconnect processes substantially impact the polishing and final results. While these pattern dependencies have not been discussed here, we believe that further research is needed to characterize and model not only chip-scale dependencies in electroplating and CMP, but also in barrier and dielectric deposition, as well as lithography and etch. Ultimately, a chip-scale back-end-of-line simulator can help identify and improve both these unit processes and their integration.
Bibliography


[54] T. S. Cale, private communication.


