2017-07-05 §
12:11 <apergos> puppet is currently disabled again on snapshots 1,5,6,7 and on dataset1001; we saw the same nfs issue shortly after reboot, with no dump processes going, as snapshots 5,6,7 had not remounted the filesystem [production]
11:20 <moritzm> rebooting wtp2* servers for kernel update [production]
11:14 <moritzm> rebooting restbase1009 for kernel update [production]
10:56 <hashar> restarting Jenkins for plugin upgrades [production]
10:45 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 59s) [production]
10:41 <marostegui> Run redact_sanitarium on s6 databases db1102 - T153743 [production]
10:41 <moritzm> rebooting wtp1001 for kernel update [production]
10:37 <moritzm> rebooting restbase1008 for kernel update [production]
10:32 <apergos> rebooting snapshot hosts to clean up hung nfs client processes [production]
10:30 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 51s) [production]
10:24 <apergos> rebooted dataset1001 to unstick nfsd and pick up new kernel, re-enabled puppet [production]
10:14 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1066 - T168661 (duration: 02m 50s) [production]
10:11 <moritzm> rebooting restbase1007 for kernel update [production]
10:01 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1066 - T168661 (duration: 02m 50s) [production]
09:57 <marostegui> Deploy alter table on s1 eqiad hosts - T168661 [production]
09:48 <godog> move 'instances' graphite hierarchy out of the way, do not delete yet - T143405 [production]
09:27 <marostegui> Stop MySQL on db1085 for maintenance - T153743 [production]
09:21 <godog> upload nginx_1.11.10-1+wmf2 to jessie-wikimedia and nginx_1.11.10-1+wmf2~stretch1 to stretch-wikimedia [production]
09:17 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1085 - T153743 (duration: 02m 50s) [production]
08:44 <apergos> puppet disabled and processes accessing dataset1001 exported filesystem shot, on: stat1002,3, snapshot1001,5,6,7, while investigation continues [production]
07:27 <moritzm> rebooting restbase-dev* for kernel update [production]
07:13 <moritzm> rebooting notebook* hosts [production]
05:18 <marostegui> Deploy alter table on s3 master - db1075 - T168661 [production]
05:13 <marostegui> Deploy alter table on s7 master - db1062 - T168661 [production]
05:08 <marostegui> Force a relearn on db1046's BBU - T166141 [production]
02:27 <l10nupdate@tin> scap sync-l10n completed (1.30.0-wmf.7) (duration: 10m 23s) [production]
2017-07-04 §
21:40 <volans> ACK'ed puppet not running on stat100[2-3],snapshot100[1,5-7] due to NFS overloaded on dataset1001 - T169680 [production]
16:54 <jynus> dropping ukwikimedia from several labsdb hosts [production]
16:10 <moritzm> rebooting radium for kernel update [production]
15:09 <mobrovac@tin> Finished deploy [citoid/deploy@9d22567]: Fallback to crossRef (T165105) and use MarcXML (T165105) (duration: 02m 52s) [production]
15:06 <mobrovac@tin> Started deploy [citoid/deploy@9d22567]: Fallback to crossRef (T165105) and use MarcXML (T165105) [production]
15:02 <godog> set operations/debs/nginx as hidden and update description [production]
14:57 <ema> pybal 1.13.7 uploaded to apt.w.o, testing it on pybal-test2001 T82747 T154759 [production]
14:31 <godog> copy nginx from jessie-wikimedia to stretch-wikimedia [production]
14:15 <paravoid> reset db2038's iLO [production]
13:06 <filippo@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ms-fe2005.codfw.wmnet [production]
11:47 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Remove comments from db1039 status - T166208 (duration: 02m 50s) [production]
11:25 <joal@tin> Finished deploy [analytics/refinery@88cbb9e]: Regular weekly deploy (2) - Bug patch (duration: 03m 38s) [production]
11:21 <joal@tin> Started deploy [analytics/refinery@88cbb9e]: Regular weekly deploy (2) - Bug patch [production]
11:15 <elukey> powercycle elastic1018, host unreachable [production]
11:02 <joal@tin> Finished deploy [analytics/refinery@12c5f57]: Regular weekly deploy (duration: 04m 47s) [production]
11:00 <moritzm> rebooting kubernetes workers for kernel update [production]
10:58 <godog> copy wikimedia-lvs-realserver from jessie-wikimedia to stretch-wikimedia [production]
10:57 <joal@tin> Started deploy [analytics/refinery@12c5f57]: Regular weekly deploy [production]
10:53 <gehel> killing stuck wmf-reimage on puppetmaster1001 for maps-test2001 [production]
10:40 <marostegui> Stop replication on db1102 (sanitarium3) on s2 shard for maintenance - T153743 [production]
10:33 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1060 - T153743 (duration: 02m 49s) [production]
10:23 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1035 - T168661 (duration: 02m 49s) [production]
10:14 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1035 - T168661 (duration: 02m 50s) [production]
09:58 <marostegui> Move labsdb1009 main general replication thread to a named replication thread called db1095 - T153743 [production]