Category Archives: Linux Hardware

iSCSI connection states in Open-iSCSI

This post shows how the iSCSI connection state changes when the underlying network interface goes from "UP BROADCAST RUNNING MULTICAST" to "UP BROADCAST MULTICAST", i.e. when it loses the "RUNNING" flag.

Log entries showing that the network interface no longer has the state "RUNNING":

Jul 3 14:17:31 host kernel: [974138.571169] bnx2 0000:08:05.0 eth2: NIC Copper Link is Down
Jul 3 23:05:05 host kernel: [1005760.957474] sd 10:0:0:0: rejecting I/O to offline device
... previous message repeats many times ...

Checking iSCSI connection state:

# iscsiadm -m session -P1
...
iSCSI Connection State: TRANSPORT WAIT
iSCSI Session State: FREE
Internal iscsid Session State: REOPEN

Log entry once the network interface state is back to "RUNNING":

Jul 4 06:56:31 host kernel: [1034019.191222] bnx2 0000:08:05.0 eth2: NIC Copper Link is Up, 1000 Mbps full duplex

Checking iSCSI connection state again:

# iscsiadm -m session -P1
...
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

The output of "iscsiadm -m session -P1" can be used for monitoring the iSCSI connection e.g. in a simple Nagios or Icinga Perl script.
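
As a rough illustration of such a check, here is a minimal sketch written in shell rather than Perl. The function name check_iscsi_state is made up for this example. It reads the output of "iscsiadm -m session -P1" on standard input and maps the two state lines shown above to the usual Nagios plugin exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN). It assumes the state strings look exactly like the ones in the output above.

```shell
#!/bin/sh
# Hypothetical helper: evaluate "iscsiadm -m session -P1" output from stdin.
check_iscsi_state() {
    awk '
        # Strip everything up to "State:" and any trailing whitespace.
        function state(line) {
            sub(/^.*State:[ \t]*/, "", line)
            sub(/[ \t\r]+$/, "", line)
            return line
        }
        /iSCSI Connection State:/ { conn++; if (state($0) != "LOGGED IN") bad++ }
        /iSCSI Session State:/    { sess++; if (state($0) != "LOGGED_IN") bad++ }
        END {
            if (conn + sess == 0) { print "UNKNOWN: no iSCSI session found"; exit 3 }
            if (bad > 0) { print "CRITICAL: " bad " iSCSI state(s) not logged in"; exit 2 }
            print "OK: " conn " iSCSI connection(s) logged in"
        }'
}
```

It could be called as "iscsiadm -m session -P1 | check_iscsi_state". The "Internal iscsid Session State" line is deliberately ignored, because "NO CHANGE" is the healthy value there.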

Buying a used SATA disk

With SSDs taking over, people are selling their old magnetic disks cheaply on eBay and other platforms. Here are some steps to take after plugging a used SATA drive into your Linux system.

Keep in mind that all disk information can probably be manipulated, including model name, serial number, firmware version, etc.

Check general drive information

# hdparm -I /dev/sdx
- Model Number
- Serial Number (to identify physical drive e.g. in case of replacement)
- Nominal Media Rotation Rate
- DMA: udma6
- Write cache
- SMART error logging
- SMART self-test
- SCT Error Recovery Control (if used in a RAID array)
- Security: Password not enabled/locked
(Enabled features are preceded by *)

Check SMART capabilities

# smartctl -i /dev/sdx
- SMART support is: Available
- SMART support is: Enabled (if not, enable it with "smartctl -s on /dev/sdx")
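
The two "SMART support is:" lines can also be checked in a script. This is only a sketch: the function name smart_support_status is invented here, and it assumes that the lines start with "SMART support is:" followed by "Available" and "Enabled" as listed above. It reads "smartctl -i" output on standard input.

```shell
#!/bin/sh
# Hypothetical helper: classify SMART support from "smartctl -i" output on stdin.
# Prints enabled/disabled/unavailable and returns 0/1/2 accordingly.
smart_support_status() {
    awk '
        /^SMART support is:[ \t]*Available/ { avail = 1 }
        /^SMART support is:[ \t]*Enabled/   { enabled = 1 }
        END {
            if (!avail)   { print "unavailable"; exit 2 }
            if (!enabled) { print "disabled";    exit 1 }
            print "enabled"
        }'
}
```

Called as "smartctl -i /dev/sdx | smart_support_status", a result of "disabled" is the case where "smartctl -s on /dev/sdx" applies.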

Check detailed SMART information

# smartctl -a /dev/sdx
- Model Family/Device Model
- User Capacity
- Rotation Rate
- SATA Version: current speed
- Vendor specific SMART attributes:
o Start_Stop_Count
    (Usually the same as Power_Cycle_Count)
o Reallocated_Sector_Ct
    (Bad sectors that have been marked by the disk?)
o Power_On_Hours
    (Disk has been used 24/7 as a NAS drive?)
o Power_Cycle_Count
    (Usually the same as Start_Stop_Count)
o G-Sense_Error_Rate
   (Disk has been dropped on the floor?)
o Load_Cycle_Count
   (Head load/unload cycles; often much higher than Start_Stop_Count and Power_Cycle_Count, compare with the rated load cycles in the data sheet)
o Temperature_Celsius
- SMART Error Log (Are there any entries?)
- SMART Self-test (Anything other than "Completed without error")
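
To pull single attributes out of that table in a script, something like the following sketch may help. The function name smart_attr_raw is made up for this example. It assumes the usual attribute table layout of "smartctl -A" or "smartctl -a", where the attribute name is the second column and the raw value starts in the tenth. Some drives report raw values in other formats (e.g. hours plus minutes), so treat the result as text first.

```shell
#!/bin/sh
# Hypothetical helper: print the raw value (10th column) of one attribute
# from the smartctl attribute table read on stdin.
# Usage: smart_attr_raw ATTRIBUTE_NAME < saved_smartctl_output
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}
```

For example, save the table once with "smartctl -A /dev/sdx > attrs.txt" and then run "smart_attr_raw Reallocated_Sector_Ct < attrs.txt" and "smart_attr_raw Power_On_Hours < attrs.txt". The file name attrs.txt is arbitrary.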

Temperature history and SCT

# smartctl -x /dev/sdx
- Temperature history
- SCT Error Recovery Control
    (Only important for use in RAID arrays, see one of my previous posts)

SMART tests

SMART tests do not degrade drive performance; they are more like collecting statistical data from the drive. Online and offline tests can be executed during normal operation.

# smartctl -t long /dev/sdx
Expected output:

root@linux:~# smartctl -t long /dev/sdx

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 103 minutes for test to complete.
Test will complete after Fri May 20 09:48:56 2016

Use smartctl -X to abort test.

Check test result in drive logs:

# smartctl -l selftest /dev/sdx
Expected output:

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 37143 -

# smartctl -l error /dev/sdx
Expected output:

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
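
Both logs can be evaluated automatically as well. The following is a sketch under the assumption that the logs look like the expected output above: self-test entries start with "#" followed by the entry number, and a clean error log contains the line "No Errors Logged". The function names are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: read "smartctl -l selftest" output on stdin and print
# every log entry whose status is not "Completed without error".
# Returns 1 if such an entry exists, 0 otherwise.
selftest_failures() {
    awk '
        /^#[ \t]*[0-9]+/ && !/Completed without error/ { print; n++ }
        END { exit (n > 0 ? 1 : 0) }'
}

# Hypothetical helper: succeed if "smartctl -l error" output on stdin
# reports an empty error log.
error_log_clean() {
    grep -q 'No Errors Logged'
}
```

Note that selftest_failures also prints tests that are still in progress or were aborted, so it is best run after the test has finished. An empty self-test log counts as "no failures" here, which may or may not be what you want for a used drive.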

Conclusion

So what do you do if some of the values don't look right? Don't worry. The drive might still work without problems for another year or two. But you should keep an eye on it:

  • Run regular SMART tests to see if error rate / reallocated sector count increases.
  • Don't use the drive as a sole disk medium for critical high performance data. Maybe make it part of a RAID 1 or RAID 6 array or use it as a hotspare / cold standby drive. Even then you should run regular SMART tests on the drive.
  • Make sure that the temperature is not too high (it should be somewhere around 40 degrees Celsius, depending on the drive).
  • Minimize power cycles. Using a worn out disk on a PC or laptop that gets rebooted a couple of times every day is not a good idea.
  • If you can afford it, use more file system and RAID caching to minimize disk reads and writes. RAID controllers usually support writethrough and writeback. While writeback minimizes disk writes, it should only be used on battery backed or flash backed RAID controllers. Don't use software RAID or fake RAID controllers.
  • On Linux there is a tool called iotop to identify processes with heavy read/write operations. Reconfigure those processes to use different disks.
  • Run the disk for a couple of days without any important data on it. Check SMART values and see if you are comfortable with it.
  • Make frequent backups of your data to prepare for disk failure. Don't use the drive as a backup medium.
  • Don't use the drive in high availability environments (with or without RAID). If you use the drive in your laptop or a PC without RAID, make sure to have a spare drive at hand and make daily backups.

Why should I go through all this and not buy a new drive? At least it's more reliable and the higher price will pay off.

Even new drives might fail within a couple of days or weeks without any prior signs. There is no guarantee that a drive - new or old - will not stop working from one second to the next. Of course older drives are more likely to fail than new drives (see MTBF / load/unload cycles / power-on hours / warranty duration of your drive specification data-sheet).

But if you prepare carefully for disk failures and minimize the risk, you can save some money and spend it on that brand new SSD that will be on the market in a year. SSD technology is progressing rapidly and it might be worth waiting for prices to drop.

Western Digital 3.5" disk drives

Because of their low price, WD drives are well suited for the SOHO market. Most of them have a SATA interface unless otherwise noted. SAS drives usually consume more power, but despite having a smaller cache they rank at the upper end of performance compared to similar SATA drives.

Caviar Green (cool, quiet, decreased power)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701229.pdf
up to 3 TB
110 MB/s (123-150 MB/s for *ZRX/*ZDX models)
Intellipower, so no fixed rotational speed (RPM)
64 MB cache
300,000 load cycles
2.1-5.5 W power in idle mode (less power for *ZRX/*ZDX models)
2 years warranty

Green (cool, quiet, decreased power)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771438.pdf
up to 4 TB
ca. 150 MB/s
Intellipower, so no fixed rotational speed (RPM)
64 MB cache
300,000 load cycles
2.5-3.3 W power in idle mode
2 years warranty

Caviar Blue (standard desktop)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701277.pdf
up to 1 TB
126-150 MB/s
7200 RPM
8-64 MB cache
300,000 load cycles
4.9-6.1 W power in idle mode
2 years warranty

Blue (standard desktop, energy efficient for non *X models)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771436.pdf
up to 6 TB
126-175 MB/s (>= 147 MB/s for *Z?? models)
5400 RPM (7200 RPM for *X models)
16-64 MB cache
300,000 load cycles
2.5-3.4 W power in idle mode (>= 4.9 W for *X models)
2 years warranty

Caviar Black (desktop performance)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701276.pdf
up to 2 TB
126-150 MB/s
7200 RPM
64 MB cache (32 MB for *LX model)
300,000 load cycles
5.6-8.2 W power in idle mode
5 years warranty

Black (desktop high performance)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771434.pdf
up to 6 TB
150-218 MB/s
7200 RPM
64-128 MB cache
300,000 load cycles
6.1-7.6 W power in idle mode (8.1 W for *FZEX models)
5 years warranty

Red / Red Pro (NAS Storage, *CX models are 2.5", *FF?? are Pro models and faster)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800002.pdf
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800022.pdf
up to 6 TB (up to 1 TB for *CX models)
147-214 MB/s (144 MB/s for *CX models)
Intellipower, so no fixed rotational speed (7200 RPM for *FF?? Pro models)
64-128 MB cache (16 MB for *CX models)
600,000 load cycles
1,000,000 hours MTBF
2.3-3.4 W power in idle mode (0.6 W for *CX models, >= 4.9 W for *FF?? Pro models)
3 years warranty (5 years warranty for Pro models)

Caviar Re (RAID Edition with PATA interface)
http://support.wdc.com/product.aspx?ID=504&lang=en

Re (RAID Edition)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800044.pdf
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800066.pdf
SAS: http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771386.pdf
up to 6 TB
128-225 MB/s
7200 RPM
32-128 MB cache
600,000 load cycles
1,200,000-2,000,000 hours MTBF
0.63% AFR
4.4-9.2 W power in idle mode
5 years warranty

Se (Datacenter capacity, increased reliability)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800042.pdf
up to 6 TB
164-214 MB/s
7200 RPM
64-128 MB cache
300,000 load cycles
800,000-1,200,000 hours MTBF
4.6-8.1 W power in idle mode
5 years warranty

Ae (Datacenter archive, spin-down capability for cold data)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800045.pdf
6 TB
> 150 MB/s
5760 RPM
64 MB cache
300,000 load cycles
500,000 hours MTBF
4.8 W power in idle mode
3 years warranty

Xe (Datacenter, SAS)
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771463.pdf
up to 900 GB
204 MB/s
10,000 RPM
32 MB cache
600,000 load cycles
2,000,000 hours MTBF
5.2 W power in idle mode
5 years warranty

 

SCT Error Recovery Control in RAID drives

SCT ERC (SMART Command Transport Error Recovery Control) controls how much time a drive spends trying to recover from read/write errors on defective sectors. Once that time has expired, the drive gives up on fixing the problem itself and reports a read/write failure to the RAID controller. Without such a limit, a drive can stay busy retrying for longer than the controller is willing to wait, and the controller then drops the whole drive. ERC therefore prevents the RAID array from being degraded just because one drive has a single defective sector. RAID recovery might take a long time and stresses all remaining drives.

Linux's mdraid handles the ERC timeout as follows:
- Read missing data from other RAID devices
- Overwrite bad block
- Reread bad block
If overwrite or reread of bad block fails again, then finally the drive will be disabled and the array will be degraded.

Hard drive manufacturers have different names for this error recovery feature:
- Western Digital: TLER (For WD Re drives, this feature cannot be disabled and the timeout is fixed at 7 seconds, see http://support.wdc.com/KnowledgeBase/answer.aspx?ID=1478. For WD Red drives, this feature can be configured.)
- Seagate: ERC (e.g. for Barracuda ES and ES.2 family SATA enterprise drives, see http://knowledge.seagate.com/articles/en_US/FAQ/203991en?language=en_US)
- Samsung, Hitachi: CCTL

The drive's timeout should be lower than the RAID controller timeout. Check the current timeout of your disk drive:

$ smartctl -l scterc /dev/sda
...
SCT Error Recovery Control command not supported
(If ERC is not supported by the drive, it might be a cheap desktop model.)

Set disk read and write timeout to 20 seconds (the values are given in units of 100 ms):

$ smartctl -l scterc,200,200 /dev/sda

Check the kernel's command timeout for the device (in seconds), which takes the role of the controller timeout for Linux's software RAID:

$ cat /sys/block/sda/device/timeout
30
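
The comparison between the two timeouts is simple arithmetic, but the units differ: smartctl uses units of 100 ms, the kernel uses seconds. With the values above, 200 x 100 ms = 20 s, which is below the 30 s kernel timeout. Here is a small sketch of that check. The function names are made up, and the parsing assumes that "smartctl -l scterc" prints a line of the form "Read: 70 (7.0 seconds)" (or "Read: Disabled") when ERC is supported.

```shell
#!/bin/sh
# Hypothetical helper: print the read ERC value (units of 100 ms, or the
# word "Disabled") from "smartctl -l scterc" output on stdin.
erc_read_value() {
    awk '$1 == "Read:" { print $2; exit }'
}

# Hypothetical helper: succeed if the drive timeout ($1, units of 100 ms)
# is enabled and lower than the kernel timeout ($2, seconds).
erc_below_kernel_timeout() {
    # Reject empty or non-numeric input such as "Disabled".
    case "$1" in ''|*[!0-9]*) return 1 ;; esac
    case "$2" in ''|*[!0-9]*) return 1 ;; esac
    [ "$1" -gt 0 ] && [ "$1" -lt $(( $2 * 10 )) ]
}
```

A possible use: erc_below_kernel_timeout "$(smartctl -l scterc /dev/sda | erc_read_value)" "$(cat /sys/block/sda/device/timeout)". If ERC is not supported, erc_read_value prints nothing and the check fails.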

 

CSL 300 Mbit/s wifi adapter with Debian 8 Jessie

The CSL 300 Mbit/s wifi adapter is available at Amazon and is an inexpensive wifi USB adapter for Linux. It supports 802.11 b/g/n, WPA2, and has an external antenna adapter.

It identifies as follows with "sudo lsusb":

Bus 003 Device 002: ID 0bda:8172 Realtek Semiconductor Corp. RTL8191SU 802.11n WLAN Adapter

The loaded kernel module is "r8712u" (check with "sudo lsmod | grep r8712u").

To make it work with Debian Jessie, all you need to do is to install the standard Debian package "firmware-realtek". The output in "kern.log" after installing the package and plugging in the USB adapter should look something like this:

Sep 27 13:50:37 computername kernel: [    9.617950] r8712u: module is from the staging directory, the quality is unknown, you have been warned.
Sep 27 13:50:37 computername kernel: [    9.618985] r8712u: Staging version
Sep 27 13:50:37 computername kernel: [    9.619009] r8712u: register rtl8712_netdev_ops to netdev_ops
Sep 27 13:50:37 computername kernel: [    9.619014] usb 4-2: r8712u: USB_SPEED_HIGH with 4 endpoints
Sep 27 13:50:37 computername kernel: [    9.619553] usb 4-2: r8712u: Boot from EFUSE: Autoload OK
Sep 27 13:50:37 computername kernel: [   10.284174] usb 4-2: r8712u: CustomerID = 0x000a
Sep 27 13:50:37 computername kernel: [   10.284178] usb 4-2: r8712u: MAC Address from efuse = 20:ac:3f:b9:b9:b9
Sep 27 13:50:37 computername kernel: [   10.284181] usb 4-2: r8712u: Loading firmware from "rtlwifi/rtl8712u.bin"
Sep 27 13:50:37 computername kernel: [   10.284258] usbcore: registered new interface driver r8712u
Sep 27 13:50:37 computername kernel: [   10.348992] usb 4-2: firmware: direct-loading firmware rtlwifi/rtl8712u.bin
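
If you want to script the detection step, the USB ID from the lsusb line above is the most reliable thing to match on. The following is just a sketch: the function name has_usb_id is invented, and it assumes the standard lsusb line layout "Bus ... Device ...: ID vendor:product ...".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the lsusb output on stdin lists a device
# with the given vendor:product ID (compared case-insensitively).
has_usb_id() {
    awk -v id="$1" '
        $5 == "ID" && tolower($6) == tolower(id) { found = 1 }
        END { exit !found }'
}
```

For this adapter the call would be "lsusb | has_usb_id 0bda:8172". Whether the firmware file named in the log is installed can then be checked separately, e.g. with "ls /lib/firmware/rtlwifi/rtl8712u.bin", assuming the usual /lib/firmware location.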