Configure dns and puppet repositories for new drmrs datacenter
Closed, ResolvedPublic

Description

Due: Q4 FY2021

drmrs is the codename selected for our new edge DC in Marseille.
185.15.58.0/24 is the public subnet for the new site.
The datacenter numeric code is 6 (vs e.g. 3 for esams or 5 for eqsin).

https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions should probably be updated for drmrs as well!
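
As a rough illustration of how the numeric code and public subnet above might be used, the sketch below maps a hostname to its site and checks whether an address falls inside the drmrs public range. Only the codes 3, 5 and 6 and the /24 come from this task. The rule that the first digit of a host's four-digit number is the site code is an assumption inferred from names such as lvs6001, and the function names are hypothetical.

```python
import ipaddress
import re

# Site codes stated in this task; other sites are deliberately omitted.
SITE_CODES = {3: "esams", 5: "eqsin", 6: "drmrs"}

# Public subnet for drmrs, as given in the task description.
DRMRS_PUBLIC = ipaddress.ip_network("185.15.58.0/24")


def site_for_host(hostname):
    """Return the site name for a host like 'lvs6001', or None if unknown.

    Assumption: the first digit of the four-digit suffix is the site code.
    """
    short = hostname.split(".")[0]
    match = re.fullmatch(r"[a-z-]+(\d)\d{3}", short)
    if match is None:
        return None
    return SITE_CODES.get(int(match.group(1)))


def in_drmrs_public(address):
    """Return True if the address falls inside the drmrs public subnet."""
    return ipaddress.ip_address(address) in DRMRS_PUBLIC
```

Hosts whose leading digit is not listed in the task (for example cumin1001) return None rather than a guessed site.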

Details

Repo Branch Lines +/-
labs/private master +6 -0
operations/puppet production +2 -0
operations/puppet production +1 -0
operations/puppet production +25 -0
operations/puppet production +92 -5
operations/puppet production +3 -0
operations/puppet production +10 -1
operations/puppet production +6 -0
operations/puppet production +1 -0
operations/puppet production +7 -2
operations/puppet production +14 -18
operations/puppet production +8 -0
operations/puppet production +5 -0
operations/puppet production +17 -1
operations/puppet production +1 -1
operations/puppet production +2 -5
operations/puppet production +112 -3
operations/mediawiki-config master +7 -1
operations/puppet production +1 -1
operations/homer/public master +5 -1
operations/puppet production +8 -0
operations/puppet production +15 -16
operations/dns master +13 -0
operations/puppet production +13 -0
operations/dns master +1 -1
operations/puppet production +1 -1
operations/puppet production +6 -1
operations/dns master +0 -4
operations/dns master +1 -2
operations/puppet production +4 -4
operations/puppet production +6 -0
operations/puppet production +1 -1
operations/puppet production +2 -1
operations/puppet production +5 -0
operations/puppet production +6 -1
operations/puppet production +5 -0
operations/dns master +2 -2
operations/puppet production +1 -1
operations/puppet production +44 -0
operations/puppet production +26 -0
operations/puppet production +2 -0
operations/puppet production +21 -2
operations/puppet production +50 -2
operations/puppet production +7 -0
operations/software/pywmflib master +2 -1
operations/puppet production +4 -4
operations/puppet production +5 -0
operations/puppet production +24 -0
operations/puppet production +11 -3
operations/puppet production +2 -2
operations/puppet production +108 -0
operations/dns master +3 -0
operations/puppet production +4 -0
operations/puppet production +6 -3
operations/puppet production +2 -0
operations/puppet production +24 -0
operations/dns master +5 -1
operations/puppet production +1 -1
operations/puppet production +56 -0
operations/puppet production +0 -24
operations/puppet production +22 -0
operations/dns master +12 -12
operations/dns master +106 -1
operations/dns master +95 -1
operations/dns master +78 -1
operations/puppet production +2 -1
operations/puppet production +1 -4
operations/puppet production +4 -1
operations/puppet production +1 -0
operations/puppet production +5 -0
operations/puppet production +1 -0
operations/puppet production +1 -0
operations/puppet production +1 -1
operations/puppet production +1 -0
operations/puppet production +3 -1
operations/puppet production +1 -0
operations/puppet production +1 -0
operations/puppet production +1 -0
operations/puppet production +5 -0
operations/puppet production +1 -0
operations/puppet production +2 -0
operations/puppet production +2 -0
operations/puppet production +3 -1
operations/puppet production +1 -0
operations/puppet production +2 -0

Related Objects

Status Subtype Assigned Task
Resolved BBlack
Resolved MMandere
Resolved cmooney

Event Timeline

Change 739553 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: define dual ganeti clusters

https://gerrit.wikimedia.org/r/739553

Change 739553 merged by BBlack:

[operations/puppet@production] drmrs: define dual ganeti clusters

https://gerrit.wikimedia.org/r/739553

Change 739584 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs ganeti: add cluster cert public keys

https://gerrit.wikimedia.org/r/739584

Change 739584 merged by BBlack:

[operations/puppet@production] drmrs ganeti: add cluster cert public keys

https://gerrit.wikimedia.org/r/739584

Change 739586 had a related patch set uploaded (by BBlack; author: BBlack):

[labs/private@master] Add dummy private keys for drmrs ganeti

https://gerrit.wikimedia.org/r/739586

Change 739588 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] ganeti6: switch to ganeti role

https://gerrit.wikimedia.org/r/739588

Change 739588 merged by BBlack:

[operations/puppet@production] ganeti6: switch to ganeti role

https://gerrit.wikimedia.org/r/739588

Change 739594 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: include netbox svc file

https://gerrit.wikimedia.org/r/739594

Change 739594 merged by BBlack:

[operations/dns@master] drmrs: include netbox svc file

https://gerrit.wikimedia.org/r/739594

Change 739757 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs lvs instances

https://gerrit.wikimedia.org/r/739757

Change 739757 merged by MMandere:

[operations/puppet@production] site: Add drmrs lvs instances

https://gerrit.wikimedia.org/r/739757

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6001.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6001.drmrs.wmnet with OS buster completed:

  • lvs6001 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181017_mmandere_20302_lvs6001.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged
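
The log file names reported by the reimage cookbook appear to follow a `<timestamp>_<user>_<number>_<host>.out` pattern. The sketch below splits such a path into its parts. The pattern is inferred only from the examples in this task, the meaning of the numeric field (possibly a process ID) is an assumption, and `parse_reimage_log` is a hypothetical helper rather than part of spicerack.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Inferred layout: 12-digit timestamp, user, a number, short hostname.
LOG_NAME = re.compile(
    r"(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<number>\d+)_(?P<host>[^_.]+)\.out"
)


def parse_reimage_log(path):
    """Split a reimage log path into start time, user, number and host."""
    name = PurePosixPath(path).name
    match = LOG_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised reimage log name: {name}")
    return {
        "started": datetime.strptime(match["stamp"], "%Y%m%d%H%M"),
        "user": match["user"],
        "number": int(match["number"]),  # assumed to be a PID; not confirmed
        "host": match["host"],
    }
```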

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6002.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6003.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6002.drmrs.wmnet with OS buster completed:

  • lvs6002 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181105_mmandere_29884_lvs6002.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6003.drmrs.wmnet with OS buster completed:

  • lvs6003 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181127_mmandere_31664_lvs6003.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Change 747856 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs bastion host

https://gerrit.wikimedia.org/r/747856

Change 747856 merged by MMandere:

[operations/puppet@production] site: Add drmrs bastion host

https://gerrit.wikimedia.org/r/747856

Change 748125 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bast6001: set dhcp macaddr for ganeti vm

https://gerrit.wikimedia.org/r/748125

Change 748125 merged by BBlack:

[operations/puppet@production] bast6001: set dhcp macaddr for ganeti vm

https://gerrit.wikimedia.org/r/748125

Change 748151 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bast6001: add to bastion_hosts

https://gerrit.wikimedia.org/r/748151

Change 748151 merged by BBlack:

[operations/puppet@production] bast6001: add to bastion_hosts

https://gerrit.wikimedia.org/r/748151

Change 748174 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: add site.pp entry

https://gerrit.wikimedia.org/r/748174

Change 748175 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: use for drmrs installs

https://gerrit.wikimedia.org/r/748175

Change 748178 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] install6001: use as proxy for drmrs

https://gerrit.wikimedia.org/r/748178

Change 748174 merged by BBlack:

[operations/puppet@production] install6001: add site.pp entry

https://gerrit.wikimedia.org/r/748174

Change 748182 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: dhcp entry

https://gerrit.wikimedia.org/r/748182

Change 748182 merged by BBlack:

[operations/puppet@production] install6001: dhcp entry

https://gerrit.wikimedia.org/r/748182

Change 748175 merged by BBlack:

[operations/puppet@production] install6001: use for drmrs installs

https://gerrit.wikimedia.org/r/748175

Change 748178 merged by BBlack:

[operations/dns@master] install6001: use as proxy for drmrs

https://gerrit.wikimedia.org/r/748178

Change 748215 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: remove fake prometheus6001 dns entry

https://gerrit.wikimedia.org/r/748215

Change 748215 merged by BBlack:

[operations/dns@master] drmrs: remove fake prometheus6001 dns entry

https://gerrit.wikimedia.org/r/748215

Change 748224 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] prometheus6001: macaddr and site.pp

https://gerrit.wikimedia.org/r/748224

Change 748225 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] prometheus6001: add to global node list

https://gerrit.wikimedia.org/r/748225

Change 748224 merged by BBlack:

[operations/puppet@production] prometheus6001: macaddr and site.pp

https://gerrit.wikimedia.org/r/748224

Change 748225 merged by BBlack:

[operations/puppet@production] prometheus6001: add to global node list

https://gerrit.wikimedia.org/r/748225

Change 748227 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] Add prometheus.svc.drmrs.wmnet alias

https://gerrit.wikimedia.org/r/748227

Change 748227 merged by BBlack:

[operations/dns@master] Add prometheus.svc.drmrs.wmnet alias

https://gerrit.wikimedia.org/r/748227

Change 748228 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add drmrs prometheus to various global config

https://gerrit.wikimedia.org/r/748228

Change 748228 merged by BBlack:

[operations/puppet@production] Add drmrs prometheus to various global config

https://gerrit.wikimedia.org/r/748228

Change 748728 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: include Netbox files for LVS subnets

https://gerrit.wikimedia.org/r/748728

Change 748728 merged by BBlack:

[operations/dns@master] drmrs: include Netbox files for LVS subnets

https://gerrit.wikimedia.org/r/748728

Change 748746 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: configure ats-tls params

https://gerrit.wikimedia.org/r/748746

Change 748747 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] cloudgw: add newly-allocated drmrs IPs

https://gerrit.wikimedia.org/r/748747

Change 748752 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: configure lvs and public IPs

https://gerrit.wikimedia.org/r/748752

Change 748757 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: add to global datacenter list

https://gerrit.wikimedia.org/r/748757

Change 748746 merged by BBlack:

[operations/puppet@production] drmrs: configure ats-tls params

https://gerrit.wikimedia.org/r/748746

Change 748747 merged by BBlack:

[operations/puppet@production] cloudgw: add newly-allocated drmrs IPs

https://gerrit.wikimedia.org/r/748747

Change 748775 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/homer/public@master] Add drmrs addresses

https://gerrit.wikimedia.org/r/748775

Change 748790 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: ncredir puppetization

https://gerrit.wikimedia.org/r/748790

Change 748775 merged by jenkins-bot:

[operations/homer/public@master] Add drmrs addresses

https://gerrit.wikimedia.org/r/748775

Change 751952 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/mediawiki-config@master] reverse-proxy: add drmrs ranges

https://gerrit.wikimedia.org/r/751952

Change 751952 merged by jenkins-bot:

[operations/mediawiki-config@master] reverse-proxy: add drmrs ranges

https://gerrit.wikimedia.org/r/751952

Mentioned in SAL (#wikimedia-operations) [2022-01-11T14:25:36Z] <taavi@deploy1002> Synchronized wmf-config/reverse-proxy.php: Config: [[gerrit:751952|reverse-proxy: add drmrs ranges (T282787)]] (duration: 01m 36s)

Change 748752 merged by MMandere:

[operations/puppet@production] drmrs: lvs/cp puppetization

https://gerrit.wikimedia.org/r/748752

Change 756613 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs ncredir host

https://gerrit.wikimedia.org/r/756613

Change 756627 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: connect lvs bgp to switches

https://gerrit.wikimedia.org/r/756627

Change 756627 merged by BBlack:

[operations/puppet@production] drmrs host bgp fixups

https://gerrit.wikimedia.org/r/756627

Change 756639 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bird anycast: fix defaulting to local gateway

https://gerrit.wikimedia.org/r/756639

Change 756639 merged by BBlack:

[operations/puppet@production] bird anycast: fix defaulting to local gateway

https://gerrit.wikimedia.org/r/756639

Change 756613 merged by MMandere:

[operations/puppet@production] site: Add drmrs ncredir host

https://gerrit.wikimedia.org/r/756613

Change 756953 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] install_server: Add drmrs ncredir first instance

https://gerrit.wikimedia.org/r/756953

Change 756953 merged by MMandere:

[operations/puppet@production] install_server: Add drmrs ncredir first instance

https://gerrit.wikimedia.org/r/756953

Change 757024 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] install_server: Add drmrs ncredir second instance

https://gerrit.wikimedia.org/r/757024

Change 757024 merged by MMandere:

[operations/puppet@production] install_server: Add drmrs ncredir second instance

https://gerrit.wikimedia.org/r/757024

Change 748790 abandoned by BBlack:

[operations/puppet@production] drmrs: ncredir puppetization

Reason:

already done elsewhere

https://gerrit.wikimedia.org/r/748790

Change 748757 merged by BBlack:

[operations/puppet@production] drmrs: various minor global config

https://gerrit.wikimedia.org/r/748757

Change 760613 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add netflow6001 to kafka custom ferm

https://gerrit.wikimedia.org/r/760613

Change 760614 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add ops-drmrs to alertmanager config

https://gerrit.wikimedia.org/r/760614

Change 760615 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: add vk delivery error alerting

https://gerrit.wikimedia.org/r/760615

Change 760613 merged by BBlack:

[operations/puppet@production] Add netflow6001 to kafka custom ferm

https://gerrit.wikimedia.org/r/760613

Change 760614 merged by BBlack:

[operations/puppet@production] Add ops-drmrs to alertmanager config

https://gerrit.wikimedia.org/r/760614

Change 760615 merged by BBlack:

[operations/puppet@production] drmrs: add vk delivery error alerting

https://gerrit.wikimedia.org/r/760615

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6009.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6009.drmrs.wmnet with OS buster completed:

  • cp6009 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6009.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6009.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6009.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203151810_sukhe_1359834_cp6009.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB
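
The relationship between the saved confctl state and the restore commands in the log above can be sketched as follows. This is a minimal illustration that assumes each state line is a JSON object with one hostname key and a `tags` key, as printed above. `restore_commands` is a hypothetical helper, not part of conftool or the cookbook, and it copies the selector exactly as the log shows it.

```python
import json


def restore_commands(state_lines):
    """Build confctl commands that restore the previously recorded pooled state."""
    commands = []
    for line in state_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        tags = record.pop("tags")
        # The remaining keys are hostnames mapped to their previous state.
        for state in record.values():
            commands.append(
                f"sudo confctl select '{tags}' set/pooled={state['pooled']}"
            )
    return commands
```

The selector here contains only the tags because that is what the log prints. Whether a host name should also be part of the selector is not something this task states.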

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6010.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6010.drmrs.wmnet with OS buster completed:

  • cp6010 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6010.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6010.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6010.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203152026_sukhe_1375326_cp6010.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6011.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6011.drmrs.wmnet with OS buster completed:

  • cp6011 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6011.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6011.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6011.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203160011_sukhe_1403513_cp6011.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6012.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6012.drmrs.wmnet with OS buster completed:

  • cp6012 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6012.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6012.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6012.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161108_sukhe_1482638_cp6012.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6013.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6013.drmrs.wmnet with OS buster completed:

  • cp6013 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6013.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6013.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6013.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161227_sukhe_1494418_cp6013.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6014.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6014.drmrs.wmnet with OS buster completed:

  • cp6014 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6014.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6014.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6014.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161357_sukhe_1509133_cp6014.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6015.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6015.drmrs.wmnet with OS buster completed:

  • cp6015 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6015.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6015.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6015.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161446_sukhe_1518579_cp6015.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster completed:

  • cp6016 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6016.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161607_sukhe_1531550_cp6016.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

With the addition of drmrs to the DNS config in https://gerrit.wikimedia.org/r/c/operations/dns/+/771342 we're basically done with the task work here. There may be further commits, but they're part of the normal production flow, not initial site config!

Nice work all!

Change 734245 abandoned by Muehlenhoff:

[operations/puppet@production] cumin: Add drmrs DC site

Reason:

Obsolete

https://gerrit.wikimedia.org/r/734245

Change 692869 abandoned by BBlack:

[operations/puppet@production] Add drmrs site instances

Reason:

Superseded by other patches months ago

https://gerrit.wikimedia.org/r/692869

Change 692331 abandoned by BBlack:

[operations/puppet@production] conftool-data/node: Add drmrs nodes

Reason:

Superseded by other patches months ago

https://gerrit.wikimedia.org/r/692331

Change 692332 abandoned by BBlack:

[operations/puppet@production] hieradata: Add drmrs domain to puppet master allow list

Reason:

Superseded by other patches months ago

https://gerrit.wikimedia.org/r/692332

Change 692333 abandoned by BBlack:

[operations/puppet@production] hieradata/cloud: Add drmrs to ntp peers list

Reason:

Superseded by other patches months ago

https://gerrit.wikimedia.org/r/692333
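
Because the timeline interleaves upload, merge and abandon events, it can be useful to reduce it to the last known state of each change. The sketch below does that for event lines in the wording used on this page. The regular expression is inferred from those lines only, and `final_status` is a hypothetical helper, not a Gerrit or Phabricator API.

```python
import re

# Matches the three event headings that appear in this timeline.
EVENT = re.compile(
    r"Change (?P<change>\d+) "
    r"(?P<action>had a related patch set uploaded|merged|abandoned)"
    r"(?: \(by [^;]+; author: [^)]+\)| by [^:]+):"
)


def final_status(lines):
    """Map each change number to its last recorded event in the timeline."""
    status = {}
    for line in lines:
        match = EVENT.match(line)
        if match is None:
            continue  # titles, URLs and cookbook output are ignored
        action = match["action"]
        status[match["change"]] = "uploaded" if action.startswith("had") else action
    return status
```

Changes that only ever show an upload event in the visible timeline, such as 739586, would be reported as "uploaded".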