| 
      
        2018-10-19
      
     | 
  
    
  | 10:53 | 
  <arturo> | 
  icinga downtime for 2h for cloudnet1003/1004 to deploy patch related to T206261 | 
  [production] | 
            
  | 09:37 | 
  <godog> | 
  bump /proc/sys/net/core/rmem_default temporarily to 6MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484 | 
  [production] | 
            
  | 08:53 | 
  <banyek> | 
  adding wmf-pt-kill_2.2.20-1+wmf4 package for stretch (T206521) | 
  [production] | 
            
  | 08:28 | 
  <jynus> | 
  stopping db1092 and db1087 in sync | 
  [production] | 
            
  | 07:50 | 
  <godog> | 
  bump /proc/sys/net/core/rmem_default temporarily to 2MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484 | 
  [production] | 
            
  | 07:20 | 
  <marostegui> | 
  Remove mwmaint1001 grants from m5 - https://phabricator.wikimedia.org/T201343 https://phabricator.wikimedia.org/T192457 | 
  [production] | 
            
  | 07:15 | 
  <godog> | 
  powercycle ms-be1021, [19601329.556259] sd 0:1:0:1: rejecting I/O to offline device | 
  [production] | 
            
  | 07:05 | 
  <godog> | 
  bump /proc/sys/net/core/rmem_default temporarily to 1MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484 | 
  [production] | 
            
  | 06:13 | 
  <marostegui> | 
  Deploy schema change on s7 codfw host by host without replication - T204006 | 
  [production] | 
            
  | 05:58 | 
  <marostegui> | 
  Deploy schema change on s2 codfw host by host without replication - T204006 | 
  [production] | 
            
  | 05:25 | 
  <marostegui> | 
  Deploy schema change on s1 codfw host by host without replication - T204006 | 
  [production] | 
            
  | 01:49 | 
  <krinkle@deploy1001> | 
  Synchronized php-1.32.0-wmf.26/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: Ic74a9d5601b8c (duration: 00m 55s) | 
  [production] | 
            
  
    | 
      
        2018-10-18
      
     | 
  
    
  | 22:00 | 
  <mutante> | 
  lvs1011,lvs1012 - manually editing nagios NRPE config and restarting service (to make monitoring from icinga1001 work and puppet is disabled) | 
  [production] | 
            
  | 21:52 | 
  <mutante> | 
  eeden - manually editing nagios NRPE config and restarting service (to make monitoring from icinga1001 work and puppet is disabled) | 
  [production] | 
            
  | 21:49 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: group2 wikis to 1.32.0-wmf.26  refs T191072 | 
  [production] | 
            
  | 21:46 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php-1.32.0-wmf.26/includes/filerepo/file/LocalFile.php: sync Id97e1c7c2655d90928c777bc3377e5ea23f49f6b  refs T207419 (duration: 00m 53s) | 
  [production] | 
            
  | 21:29 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php-1.32.0-wmf.26/includes/filerepo/file/LocalFile.php: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/468470/ refs T207419 (duration: 00m 54s) | 
  [production] | 
            
  | 20:49 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: group2 wikis to 1.32.0-wmf.24  refs T191072 | 
  [production] | 
            
  | 20:39 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.26 | 
  [production] | 
            
  | 20:21 | 
  <volans> | 
  start ferm on db2042, it failed to start at reboot due to DNS resolution timeout | 
  [production] | 
            
  | 19:22 | 
  <ejegg> | 
  updated SmashPig standalone deploy from 5f21d3f2db to 581c685326 | 
  [production] | 
            
  | 19:21 | 
  <ejegg> | 
  updated payments-wiki from a3892e4ed3 to 06848600ed | 
  [production] | 
            
  | 19:17 | 
  <shdubsh> | 
  rebooting graphite1004 | 
  [production] | 
            
  | 19:11 | 
  <shdubsh> | 
  upping ring buffer size on graphite1004 in an attempt to mitigate dropped packets at the interface -- T196484 | 
  [production] | 
            
  | 19:02 | 
  <sbisson@deploy1001> | 
  Synchronized php-1.32.0-wmf.26/extensions/PageTriage/: SWAT: [[gerrit:468384|Use Main Object Stash for keeping track of PageTriage last use]] (duration: 00m 54s) | 
  [production] | 
            
  | 18:19 | 
  <awight> | 
  Restarting ORES services for T88997 | 
  [production] | 
            
  | 17:33 | 
  <ladsgroup@deploy1001> | 
  Finished deploy [ores/deploy@4ac4c8b]: Logstash support for ores: T181546 T169586 T168921 T181630 T205256 (duration: 23m 48s) | 
  [production] | 
            
  | 17:19 | 
  <herron> | 
  aborted enabling kafka on logstash elasticsearch cluster due to puppet errors. reverted change T206454 | 
  [production] | 
            
  | 17:09 | 
  <ladsgroup@deploy1001> | 
  Started deploy [ores/deploy@4ac4c8b]: Logstash support for ores: T181546 T169586 T168921 T181630 T205256 | 
  [production] | 
            
  | 17:00 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php: group1 wikis to 1.32.0-wmf.26  refs T191072 (duration: 00m 53s) | 
  [production] | 
            
  | 16:59 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.26  refs T191072 | 
  [production] | 
            
  | 16:57 | 
  <herron> | 
  enabling kafka on logstash elasticsearch cluster T206454 | 
  [production] | 
            
  | 16:55 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php-1.32.0-wmf.26/extensions/WikibaseQualityConstraints/src/ServiceWiring.php: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseQualityConstraints/+/468352/ refs T207394 (duration: 00m 54s) | 
  [production] | 
            
  | 16:52 | 
  <mobrovac@deploy1001> | 
  Finished deploy [restbase/deploy@6c879fa]: Have 100% of traffic directed to Proton as well - T186748 (duration: 20m 52s) | 
  [production] | 
            
  | 16:31 | 
  <mobrovac@deploy1001> | 
  Started deploy [restbase/deploy@6c879fa]: Have 100% of traffic directed to Proton as well - T186748 | 
  [production] | 
            
  | 15:51 | 
  <XioNoX> | 
  trunk cloud-instances2-b-eqiad between asw-b-eqiad and asw2-b-eqiad | 
  [production] | 
            
  | 15:50 | 
  <cmjohnson1> | 
  disabling checks on cloudvirt1019 for maintenance | 
  [production] | 
            
  | 15:42 | 
  <twentyafterfour> | 
  twentyafterfour@deploy1001 Synchronized php: group1 wikis to 1.32.0-wmf.24  refs T191072 (duration: 00m 53s) | 
  [production] | 
            
  | 15:40 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php: group1 wikis to 1.32.0-wmf.24  refs T191072 (duration: 00m 53s) | 
  [production] | 
            
  | 15:39 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.24  refs T191072 | 
  [production] | 
            
  | 15:35 | 
  <twentyafterfour@deploy1001> | 
  scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) | 
  [production] | 
            
  | 14:46 | 
  <moritzm> | 
  installing tomcat8 security updates | 
  [production] | 
            
  | 14:34 | 
  <moritzm> | 
  remove labvirt1018 from debmonitor (T207317) | 
  [production] | 
            
  | 14:28 | 
  <godog> | 
  temporarily bump default socket receive memory to 1MB on graphite1001, restart statsd-proxy and statsite | 
  [production] | 
            
  | 14:22 | 
  <godog> | 
  begin reformat of ms-be2041 - T199198 | 
  [production] | 
            
  | 14:21 | 
  <banyek> | 
  shutting down mysql and powering down db2042 (T202051) | 
  [production] | 
            
  | 14:13 | 
  <godog> | 
  correction to the statements above: graphite1004, not graphite1001 | 
  [production] | 
            
  | 14:11 | 
  <godog> | 
  ditto for statsite instances on graphite1001, temporarily bump receive socket memory to 1MB and bounce the service | 
  [production] | 
            
  | 14:08 | 
  <godog> | 
  temporarily bump receive socket memory for statsd-proxy on graphite1001 and bounce the service | 
  [production] | 
            
  | 13:51 | 
  <moritzm> | 
  installing libidn security updates | 
  [production] |