| 
      
        2016-06-22
      
     | 
  
    
  | 19:32 | 
  <bblack> | 
  start rollout of first batch of cache sysctl stuff (un-mysterious + disable prequeue timestamps) | 
  [production] | 
            
  | 19:29 | 
  <jynus> | 
  archiving and dropping reviewdb on m1 shard | 
  [production] | 
            
  | 19:06 | 
  <thcipriani@tin> | 
  rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.7 | 
  [production] | 
            
  | 18:46 | 
  <jynus> | 
  shutting down and reimaging db1001 | 
  [production] | 
            
  | 18:20 | 
  <papaul> | 
  ms-be202[3-7] - signing puppet certs, salt-key, initial run | 
  [production] | 
            
  | 17:23 | 
  <akosiaris> | 
  restart apache on ununpentium for m1 migration. Hosts RT, just did it for good measure | 
  [production] | 
            
  | 17:21 | 
  <akosiaris> | 
  restarted bacula-director on helium | 
  [production] | 
            
  | 17:15 | 
  <jynus> | 
  killing puppet, rt, librenms user connections on db1001 | 
  [production] | 
            
  | 17:10 | 
  <jynus> | 
  failed over m1-master from db1001 to db1016 | 
  [production] | 
            
  | 16:20 | 
  <gehel> | 
  new elasticsearch servers elastic1032-1047 are configured and have joined the eqiad cluster | 
  [production] | 
            
  | 15:26 | 
  <thcipriani@tin> | 
  Synchronized php-1.28.0-wmf.6/extensions/OATHAuth: SWAT: [[gerrit:295510|Fixup qrcode-generating js, to stop race condition.]] (duration: 00m 33s) | 
  [production] | 
            
  | 15:23 | 
  <thcipriani@tin> | 
  Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:295514|Improve style]] (duration: 00m 33s) | 
  [production] | 
            
  | 15:18 | 
  <thcipriani@tin> | 
  Synchronized php-1.28.0-wmf.7/extensions/OATHAuth: SWAT: [[gerrit:295511|Fixup qrcode-generating js, to stop race condition.]] (duration: 00m 27s) | 
  [production] | 
            
  | 15:13 | 
  <thcipriani@tin> | 
  Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:295478|Add www.wpc.ncep.noaa.gov to wgCopyUploadsDomains]] (duration: 00m 54s) | 
  [production] | 
            
  | 15:01 | 
  <elukey> | 
  rebooting bohrium.eqiad.wmnet (running piwik) for kernel upgrades | 
  [production] | 
            
  | 14:32 | 
  <jynus> | 
  checksumming m1 databases in preparation for failover | 
  [production] | 
            
  | 14:29 | 
  <tgr> | 
  running https://phabricator.wikimedia.org/diffusion/ECAU/browse/master/maintenance/checkLocalUser.php for some users T119736 | 
  [production] | 
            
  | 14:04 | 
  <moritzm> | 
  rolling restart of hhvm/apache on app servers in eqiad for expat security update | 
  [production] | 
            
  | 13:42 | 
  <godog> | 
  add 500G to fluorine /a (almost full) | 
  [production] | 
            
  | 13:31 | 
  <gehel> | 
  configuring new elasticsearch servers elastic1038-1042 in eqiad | 
  [production] | 
            
  | 13:03 | 
  <hashar> | 
  Manually moved some missing build records. Restarting Jenkins | 
  [production] | 
            
  | 12:49 | 
  <hashar> | 
  T80385 Restarting Jenkins with builds dir set to "${JENKINS_HOME}/builds/${ITEM_FULL_NAME}" which is /var/lib/jenkins/builds/XXX | 
  [production] | 
            
  | 12:35 | 
  <gehel> | 
  starting reimage of mw1292 | 
  [production] | 
            
  | 12:34 | 
  <_joe_> | 
  disabling puppet on mw1017, live-hacking it | 
  [production] | 
            
  | 12:34 | 
  <hashar> | 
  T80385 stopping Jenkins and migrating all build records to /var/lib/jenkins/builds | 
  [production] | 
            
  | 12:06 | 
  <gehel> | 
  configuring new elasticsearch servers elastic1033-1037 in eqiad | 
  [production] | 
            
  | 10:46 | 
  <godog> | 
  upload libphutil/arcanist 0~git20160620-0wmf1 to carbon | 
  [production] | 
            
  | 10:32 | 
  <elukey> | 
  mw1140 powercycle after freeze issues due to memory pressure (was not able to ssh to it) | 
  [production] | 
            
  | 10:18 | 
  <moritzm> | 
  rolling restart of restbase in eqiad to pick up firejail change in service::node | 
  [production] | 
            
  | 09:46 | 
  <moritzm> | 
  rolling restart of restbase in codfw to pick up firejail change in service::node | 
  [production] | 
            
  | 09:43 | 
  <legoktm> | 
  live-hacking on mw1017 to debug T115119 | 
  [production] | 
            
  | 09:19 | 
  <jynus> | 
  stopping and reconfiguring mysql on dbstore1001 | 
  [production] | 
            
  | 07:59 | 
  <moritzm> | 
  rolling restart of hhvm/apache on canary app servers in eqiad for expat security update | 
  [production] | 
            
  | 07:30 | 
  <jynus> | 
  stopping, backing up and reimaging db1061 and db1062 | 
  [production] | 
            
  | 07:06 | 
  <moritzm> | 
  restarted hhvm on mw1131 | 
  [production] | 
            
  | 04:29 | 
  <chasemp> | 
  fix salt key on labtestmetal2001 | 
  [production] | 
            
  | 03:12 | 
  <l10nupdate@tin> | 
  ResourceLoader cache refresh completed at Wed Jun 22 03:12:33 UTC 2016 (duration 6m 44s) | 
  [production] | 
            
  | 03:05 | 
  <mwdeploy@tin> | 
  scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 49s) | 
  [production] | 
            
  | 02:31 | 
  <mwdeploy@tin> | 
  scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 24s) | 
  [production] | 
            
  
    | 
      
        2016-06-21
      
     | 
  
    
  | 23:14 | 
  <yurik> | 
  updated/restarted kartotherian & tilerator - https://gerrit.wikimedia.org/r/#/c/295440/ https://gerrit.wikimedia.org/r/#/c/295441/ | 
  [production] | 
            
  | 23:05 | 
  <tgr> | 
  deleted localuser rows for Mahir256@orwikisource and A879071@enwiki for T119736 | 
  [production] | 
            
  | 22:19 | 
  <bd808> | 
  Backfilled missing 2016-06-20 data to https://tools.wmflabs.org/sal/production?d=2016-06-20 | 
  [production] | 
            
  | 22:08 | 
  <ori@tin> | 
  Synchronized static/images/mobile: I8f09e825: Optimize mobile static images (duration: 00m 34s) | 
  [production] | 
            
  | 19:27 | 
  <bd808> | 
  Restarted dead logstash process on logstash1001. Looks to have stopped itself due to the Elasticsearch OOM earlier | 
  [production] | 
            
  | 19:18 | 
  <thcipriani@tin> | 
  Purged l10n cache for 1.28.0-wmf.5 | 
  [production] | 
            
  | 19:17 | 
  <bd808> | 
  Restarted Elasticsearch on logstash1001; dead from OOM | 
  [production] | 
            
  | 19:14 | 
  <thcipriani@tin> | 
  rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.7 | 
  [production] | 
            
  | 18:50 | 
  <bblack> | 
  enabled tcp_notsent_lowat optimization on all caches (marking this time for investigation of perf graphs later) - https://gerrit.wikimedia.org/r/#/c/295376/ | 
  [production] | 
            
  | 17:16 | 
  <thcipriani@tin> | 
  Synchronized php-1.28.0-wmf.7/extensions/Graph/lib/graph2.compiled.js: pre-train backport: [[gerrit:295366|Updated to latest graph2 lib]] (duration: 00m 31s) | 
  [production] | 
            
  | 17:10 | 
  <yurik_> | 
  deployed graphoid https://gerrit.wikimedia.org/r/#/c/295367/ | 
  [production] |