2019-06-17
09:36 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:36 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
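The sre.hosts.downtime cookbook schedules Icinga downtime for a host ahead of maintenance. A minimal sketch of how such a run is typically launched from a cumin host; the flags, duration, reason, and host pattern here are illustrative assumptions, not taken from this log:

    # on a cumin host; downtime a host for two hours with a reason
    sudo cookbook sre.hosts.downtime --hours 2 -r "host maintenance" 'example1001.eqiad.wmnet'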
09:31 <elukey> set cpu governor to performance (was powersave) on analytics1070 (hadoop worker node) [production]
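Governor changes like this one are plain Linux cpufreq; a minimal sketch via sysfs, assuming (as the entry implies) the change applies to every core and is run as root:

    # switch all cores from powersave to performance
    for g in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        echo performance > "$g"
    done
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor   # verify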
09:17 <moritzm> rebooting sulfur for some tests [production]
09:15 <_joe_> The governor was set to "powersave", not "ondemand" [production]
09:13 <_joe_> setting cpufreq governor to "ondemand" on mw1348, T225713 [production]
08:52 <onimisionipe> remove maps1001 from cassandra cluster - T224395 [production]
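Removing a node from a Cassandra cluster is normally done with nodetool; a sketch assuming maps1001 was still live (a dead node would instead need `nodetool removenode <host-id>` run from a surviving member):

    # on maps1001: stream its data to the remaining replicas and leave the ring
    nodetool decommission
    # on any remaining node: confirm the node has left
    nodetool status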
07:25 <XioNoX> restart snmp daemon on mr1-eqsin [production]
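mr1-eqsin is a router, so this restart most likely happened from the Junos CLI rather than via systemd; a sketch, assuming a Junos device where the SNMP process can be restarted by name:

    ssh mr1-eqsin
    # then, from Junos operational mode:
    restart snmp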
07:10 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Repool db2107 (duration: 00m 47s) [production]
06:22 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Repool db2084 (duration: 00m 47s) [production]
06:12 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Depool db2084 for a reboot (duration: 00m 48s) [production]
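The "Synchronized wmf-config/..." entries are scap syncs of a mediawiki-config change; a minimal sketch of the usual pattern on the deploy host (the staging path and the edit step are assumed context, not shown in this log):

    cd /srv/mediawiki-staging
    # edit wmf-config/db-codfw.php to flip the host's pooled state, then:
    scap sync-file wmf-config/db-codfw.php 'Repool db2107'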
06:04 <marostegui> Stop MySQL on db2084 to reboot the host T225884 [production]
05:16 <marostegui> Stop MySQL on db2107 to clone db2051 - T221533 [production]
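Stopping MySQL for a reboot or a clone is a systemd stop on these hosts; a sketch, assuming the mariadb unit name and that the host was already depooled:

    # on db2084, after depooling:
    sudo systemctl stop mariadb
    sudo reboot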
05:11 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Depool db2107 to clone db2051 (duration: 00m 47s) [production]
05:03 <marostegui> Optimize all pc1008's tables T210725 [production]
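The pc1008 run rebuilds and defragments the parsercache tables while pc1010 takes the traffic; a sketch of one such statement (the pc000 table name is illustrative, assuming the usual parsercache naming):

    sudo mysql -e "OPTIMIZE TABLE parsercache.pc000;"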
05:03 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool pc1008 and pool pc1010 temporarily while pc1008 gets all its tables optimized T210725 (duration: 00m 59s) [production]
2019-06-14
23:23 <ejegg> updated payments-wiki from 75abd71cc1 to 79d1822644 [production]
23:19 <SMalyshev> repooled wdqs1003 [production]
23:13 <SMalyshev> repooled wdqs2003 [production]
23:10 <_joe_> set cpufreq governor for mw1348 to performance [production]
19:56 <SMalyshev> depooled wdqs2003 to catch up [production]
19:17 <SMalyshev> depooled wdqs1003 to catch up [production]
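The wdqs depool/repool cycle takes a lagging server out of the load balancer until its updater catches up; a sketch of the conftool wrapper scripts typically run on the host itself (assuming the standard pool/depool helpers):

    # on wdqs1003: stop receiving traffic while catching up on lag
    sudo depool
    # once replication lag has recovered:
    sudo pool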
15:56 <gehel> repooling wdqs1003, not catching up anyway (high edit load) [production]
15:24 <godog> test setting 'performance' governor on ms-be2035 - T210723 [production]
14:35 <godog> powercycle mw1294, down and no console [production]
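Powercycling a host that is down with no working console generally goes through its management interface; a sketch using ipmitool (the management hostname and user are assumptions, and the password would be supplied interactively):

    ipmitool -I lanplus -H mw1294.mgmt.eqiad.wmnet -U root chassis power cycle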
13:26 <gehel> depooling wdqs1003 to allow it to catch up on lag [production]
13:22 <joal@deploy1001> Started restart [analytics/aqs/deploy@fc1d232]: (no justification provided) [production]
12:38 <godog> test setting 'performance' governor on ms-be2032 - T210723 [production]
11:36 <godog> test setting 'performance' governor on ms-be2034 - T210723 [production]
10:22 <marostegui> Optimize tables on pc2008 - T210725 [production]
10:17 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Fully repool db1077 after recovering from a crash (duration: 00m 49s) [production]
10:14 <godog> test setting 'performance' governor on ms-be2031 - T210723 [production]
09:44 <godog> test setting 'performance' governor on ms-be2037 - T210723 [production]
09:43 <godog> test setting 'performance' governor on ms-be2033 - T210723 [production]
09:28 <godog> test setting 'performance' governor on ms-be2038 - T210723 [production]
09:26 <godog> test setting 'performance' governor on ms-be2016 - T210723 [production]
03:57 <SMalyshev> repooled wdqs1005 [production]
00:11 <SMalyshev> depooled wdqs1005 - let it catch up [production]
00:10 <SMalyshev> repooled wdqs1006 - caught up [production]