Alex Povolotsky
2012-12-22 08:47:24 UTC
Hello,
I'm running FreeBSD 9.0/amd64 with a pure ZFS setup on a single Seagate
disk (ST2000NM0011 SN02) attached to an LSI Logic (mpt) controller.
Yes, I know that running a single disk on a RAID controller is a bit odd;
I still have to find out whether the disk can be connected to the internal
SATA controller.
About two days ago the system became very slow. Disk usage is constantly
at 100%, and sometimes I get "swap_pager: indefinite wait buffer" errors.
I have had to reset the computer twice in two days.
mptutil does not show any errors, and smartctl shows:
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   067   063   044    Pre-fail  Always       -       6218970
  3 Spin_Up_Time            0x0003   093   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       14
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       21
  7 Seek_Error_Rate         0x000f   091   060   030    Pre-fail  Always       -       1433294073
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       8825
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       16
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       12885098499
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   047   045    Old_age   Always       -       32 (Min/Max 31/32)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       859
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       15
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       26
194 Temperature_Celsius     0x0022   032   053   000    Old_age   Always       -       32 (0 21 0 0 0)
195 Hardware_ECC_Recovered  0x001a   103   099   000    Old_age   Always       -       6218970
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
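A note on reading the large raw values above: on Seagate drives, some raw values are commonly reported to be packed 48-bit fields rather than plain counters. That is an assumption based on widespread community interpretation, not on vendor documentation. The sketch below only shows the bit arithmetic; the helper names split_high16_low32 and split_three_words are made up for this illustration. Under that assumption, Raw_Read_Error_Rate and Seek_Error_Rate carry an error count in the top 16 bits and an operation count in the low 32 bits, and Command_Timeout holds three 16-bit words.

```python
from typing import Tuple


def split_high16_low32(raw: int) -> Tuple[int, int]:
    """Split a 48-bit raw value into (top 16 bits, low 32 bits).

    Assumed layout (not vendor-documented): top 16 bits = error count,
    low 32 bits = number of operations.
    """
    return (raw >> 32) & 0xFFFF, raw & 0xFFFFFFFF


def split_three_words(raw: int) -> Tuple[int, int, int]:
    """Split a 48-bit raw value into three 16-bit words, high to low."""
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


# Raw values copied from the smartctl output in this post.
for name, raw in [("Raw_Read_Error_Rate", 6218970),
                  ("Seek_Error_Rate", 1433294073)]:
    errors, operations = split_high16_low32(raw)
    print(f"{name}: high16={errors} low32={operations}")

print("Command_Timeout words:", split_three_words(12885098499))
```

If that layout applies to this drive, both rate attributes have zero in the top 16 bits, and the Command_Timeout raw value is the three words 3, 3, 3 rather than a count of roughly 12.9 billion. Reallocated_Sector_Ct (21) is a plain count and needs no decoding.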
I have removed most of the snapshots, but that did not help.
I have stopped all active processes, but the disk load did not decrease;
it stays at 100%.
What can I check and/or replace to fix the problem? Any ideas?
Alex